Learning by Recording Cases
Using the consistency heuristic (similar to the one used in Pattern Recognition)
The consistency heuristic:
Whenever you want to guess a property of something, given nothing else to go on but a set of reference cases, find the most similar case, as measured by known properties, for which the property is known. Guess that the unknown property is the same as that known property.
Example:
Given 8 blocks of known colour, width, and height as follows, find the colour of a new block of size 1 cm × 4 cm.
Nearest Neighbour
Calculate the distance to each block and find the minimum: O(n) time.
Using a decision tree instead reduces this to O(log n).
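The brute-force O(n) approach can be sketched as follows. This is a minimal illustration, not the original slide's data: the reference blocks below are hypothetical, since the actual 8 blocks are not reproduced here. Each case pairs a (width, height) point with its known colour, and the consistency heuristic guesses the unknown colour from the nearest case.

```python
import math

def nearest_neighbour(cases, query):
    """Return the case whose known properties are closest to the query.

    cases: list of ((width, height), colour) pairs
    query: (width, height) of the new block
    """
    # Euclidean distance in the space of known properties (O(n) scan).
    return min(cases, key=lambda case: math.dist(case[0], query))

# Hypothetical reference blocks (the slide's actual data is not shown here).
blocks = [((1, 5), "orange"), ((2, 2), "purple"),
          ((4, 1), "red"), ((5, 5), "blue")]

# Consistency heuristic: guess the colour of a 1 cm x 4 cm block
# from its nearest neighbour among the reference blocks.
point, colour = nearest_neighbour(blocks, (1, 4))
print(colour)  # nearest block is (1, 5), so the guess is "orange"
```

Here the unknown property (colour) is simply copied from the most similar known case, exactly as the heuristic prescribes.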
A decision tree is a semantic tree such that
each node is connected to a set of possible answers
Each non-leaf node is connected to a test that splits its set of possible answers into subsets corresponding to different test results.
Each branch carries a particular test result's subset to another node.
A k-d tree is a decision tree (distance measured in k-dimension)
The set of possible answers consists of points, one of which may be the nearest neighbour to a given point.
Each test specifies a coordinate, a threshold, and a neutral zone around the threshold containing no points.
Each test divides a set of points into two sets, according to which side of the threshold each point lies on.
Building of the k-d tree:
If there is only one case, stop.
If this is the first division of cases, pick the vertical axis for comparison; otherwise, pick the axis that is different from the axis at the next higher level.
Considering only the axis of comparison, find the average position of the two middle objects. Call this average position the threshold, and construct a decision-tree test that compares unknowns in the axis of comparison against the threshold. Also note the positions of the two middle objects in the axis of comparison. Call these positions the upper and lower boundaries.
Divide all the objects into two subsets, according to which side of the average position they lie on.
Divide up the objects in each subset, forming a subtree for each, using this procedure.
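The building procedure above can be sketched in Python. This is a minimal 2-D version under some assumptions: points are (x, y) tuples, the tree is a plain dict, and "pick the vertical axis first" is taken to mean splitting on axis 0 at the root, alternating at each level.

```python
def build_kdtree(points, axis=0):
    """Build a k-d tree for 2-D points per the procedure above.

    Internal nodes store the splitting axis, the threshold (average of
    the two middle coordinates), and the lower/upper boundaries of the
    neutral zone; leaves store a single point.
    """
    if len(points) == 1:                     # only one case: stop
        return {"point": points[0]}
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    lower, upper = pts[mid - 1][axis], pts[mid][axis]  # the two middle objects
    threshold = (lower + upper) / 2                    # average position
    return {
        "axis": axis,
        "threshold": threshold,
        "lower": lower,                      # boundaries of the neutral zone
        "upper": upper,
        "left": build_kdtree(pts[:mid], 1 - axis),   # below the threshold
        "right": build_kdtree(pts[mid:], 1 - axis),  # above the threshold
    }

# Hypothetical points: the first split falls between x=2 and x=4.
tree = build_kdtree([(1, 5), (2, 2), (4, 1), (5, 5)])
print(tree["threshold"])  # 3.0
```

Each recursive call flips the axis of comparison, so successive levels of the tree cut the space along alternating dimensions.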
The resulting decision tree:
To find the nearest neighbour using the k-d procedure:
Determine if there is only one element in the set under consideration.
If there is only one, report it.
Otherwise, compare the unknown, in the axis of comparison, against the current node's threshold. The result determines the likely set.
Find the nearest neighbour in the likely set using this procedure.
Determine whether the distance to the nearest neighbour in the likely set is less than or equal to the distance to the other set's boundary in the axis of comparison:
If it is, then report the nearest neighbour in the likely set.
If it is not, check the unlikely set using this procedure; return the nearer of the nearest neighbours in the likely set and in the unlikely set.
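The search procedure above can be sketched as follows. This is an illustrative 2-D version, not a definitive implementation: the tree builder is repeated so the example is self-contained, and the pruning test compares the candidate distance against the threshold rather than the other set's boundary, which is slightly more conservative but still correct.

```python
import math

def build_kdtree(points, axis=0):
    # Same construction procedure as above, repeated for self-containment.
    if len(points) == 1:
        return {"point": points[0]}
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    threshold = (pts[mid - 1][axis] + pts[mid][axis]) / 2
    return {"axis": axis, "threshold": threshold,
            "left": build_kdtree(pts[:mid], 1 - axis),
            "right": build_kdtree(pts[mid:], 1 - axis)}

def nearest(node, query):
    """Return the nearest neighbour to `query` in the k-d tree."""
    if "point" in node:                      # only one element: report it
        return node["point"]
    axis, t = node["axis"], node["threshold"]
    # Compare the unknown against the threshold to pick the likely set.
    likely, unlikely = ((node["left"], node["right"])
                        if query[axis] <= t
                        else (node["right"], node["left"]))
    best = nearest(likely, query)
    # If the candidate is at least as close as the dividing threshold,
    # nothing in the other set can be nearer: report it.
    if math.dist(best, query) <= abs(query[axis] - t):
        return best
    # Otherwise check the unlikely set and return the nearer of the two.
    other = nearest(unlikely, query)
    return min(best, other, key=lambda p: math.dist(p, query))

points = [(1, 5), (2, 2), (4, 1), (5, 5)]   # hypothetical reference points
tree = build_kdtree(points)
print(nearest(tree, (1, 4)))  # (1, 5)
```

When the candidate from the likely set is close enough, the unlikely branch is never visited, which is what gives the search its logarithmic average behaviour.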