Vathanak Mao's Tech Notes: September 2016

Monday, September 5, 2016

Depth First Search, Breadth First Search and Uniform Cost Search Algorithms

Before knowing Uniform Cost Search (UCS) algorithm, we should know about Depth First Search (DFS) and Breadth First Search (BFS).

DFS is an algorithm to search a graph for a path from one vertex (node) to another. It just tells you if there is a path or not, nothing more. There can be many paths from one vertex to another. It will just choose one that it meets first; it doesn't care about how long the path is. The pseudocode to implement this algorithm uses an array variable to store the nodes that have been already explored so that the search won't go back to the already explored nodes. We still another (Map) variable to trackback the path or to know the parent or preceding node of the current node. There can be more than parent or preceding nodes but there is only one that the search crosses to reach the current node.

BFS is also an algorithm to search for a path in a graph or tree. It is based on DFS but it can find the shortest path. In processing, DFS uses stack data structure but BFS uses queue data structure. In the pseudocode from my teacher below looks like it does not have anything to do with path cost. And the only difference between it and DFS's pseudocode is frontier variable uses queue, not stack.

UCS use priority queue to store the explored nodes and their cumulative path costs so that it can find the shortest path. I'm not sure how BFS is different from UCS. The algorithm below is from my teacher. In the third line from top and another third line from bottom, it should have said cumulative PATH-COST instead of only PATH-COST.

My homework is to implement DFS, BFS and then UCS in R programming. The source code can be found here https://github.com/vathanakmao/ai-assignments

Sunday, September 4, 2016

Heapsort Algorithm

What is heap?

Heap is an untidy pile of mass of things, for example, a heap of clothes/rubbish. In data structure, a heap is tree-based data structure that satisfies the heap property: if A is a parent of node B then the key (the value) of node A is ordered with the respect of the key of node B with the same ordering applying to the heap.

Heap can be divided into 2 categories: max heap and min heap. In max heap, the max-heap property means that the keys of parent nodes are always greater than or equal to those of the children and the highest key is in the root node. In min heap, the min-heap property means that the keys of parent nodes are always less than or equal to those of the children and the lowest key is in the root node.

Algorithm

The items are stored in an array but we logically convert it to a tree.

The below algorithm of how to find the indices of the left and right children of a (parent) node. For example, node 16 has two children, node 14 and node 10.

The max-heapify operation (function) makes sure the child nodes are not greater than their parents but it doesn't go through all the nodes, just one path straight down to the children. We need another function to call this function in a loop (build-max-heap operation).

Below is the example of how max-heapify operation make change to the tree.

The build-max-heap operation (function) makes sure all nodes in the binary tree satisfy the max-heap property. In other words, the keys of child nodes must be less than or equals to the key of parent node.

Below is the example of how the build-max-heap operation make change to the tree.

The heapsort() function below will sort the tree completely.

Below is the example of how the tree is sorted step by step.

I have implemented it in Java for learning purpose. You can find the source code here: https://github.com/vathanakmao/sort-algorithms.git

References
Book: Introduction to Algorithms (3rd edition), Thomas H. Cormen, Charles E. Leiserson, ...

Navigation