In order to deal with an increasingly complex world, we need ever more sophisticated computational models that can help us make decisions wisely and understand the potential consequences of our choices. But creating a model requires far more than raw data and technical skill. In this phase, I attempt to understand more about the whole process of decision making by modelling simple data and testing different algorithms. This provides visual insight into different algorithms and how they evaluate data. Computational models can help translate observations into an anticipation of future events, act as a testbed for ideas, extract value from data and raise questions about behaviours. Here I used the program Wekinator to run different pre-written algorithms and visualized the results with Processing.
Basic Decision Stump
A decision stump is a decision tree that uses only a single attribute for splitting. For discrete attributes, this typically means that the tree consists of only a single interior node (i.e., the root has only leaves as successor nodes). If the attribute is numerical, the tree may be more complex.
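As a minimal sketch of the idea (not the Wekinator setup used in this project), a decision stump can be expressed in scikit-learn by limiting a decision tree to a depth of one; the toy data below is made up for demonstration.

```python
# A decision stump: a decision tree restricted to a single split (max_depth=1).
# Illustrative sketch with made-up toy data, not the project's Wekinator pipeline.
from sklearn.tree import DecisionTreeClassifier

# Toy data: one numerical attribute per sample, two classes.
X = [[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]]
y = [0, 0, 0, 1, 1, 1]

stump = DecisionTreeClassifier(max_depth=1)  # only the root node is allowed to split
stump.fit(X, y)

print(stump.predict([[2.5], [7.5]]))  # -> [0 1]: one threshold separates the classes
```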
Decision Tree
The decision tree algorithm belongs to the family of supervised learning algorithms and can be used for solving both regression and classification problems. Decision trees are constructed via an algorithmic approach that identifies ways to split a data set based on different conditions. It is one of the most widely used and practical methods for supervised learning.
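The sketch below shows a full decision tree classifier in scikit-learn; the standard iris dataset is a stand-in for the simple data modelled in this project.

```python
# Decision tree classification: the algorithm recursively splits the data on
# the attribute/threshold pairs that best separate the classes.
# Illustrative sketch on a standard dataset, not the data used in this project.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=3)  # limit depth to keep the tree readable
tree.fit(X_train, y_train)
print("test accuracy:", tree.score(X_test, y_test))
```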
K-Nearest-Neighbour (KNN)
KNN can be used for both classification and regression problems, though in industry it is more widely used for classification. K-nearest-neighbours is a simple algorithm that stores all available cases and classifies new cases by a majority vote: a case is assigned to the class most common amongst its k nearest neighbours, as measured by a distance function.
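A short sketch of the voting idea, with made-up 2-D points; the distance function here is the default Euclidean distance.

```python
# k-nearest-neighbours: a new case receives the majority class among the
# k stored training cases closest to it under a distance function.
# Illustrative sketch with made-up 2-D points.
from sklearn.neighbors import KNeighborsClassifier

X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]]
y = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3)  # vote over the 3 nearest neighbours
knn.fit(X, y)                              # "training" simply stores the cases

print(knn.predict([[0.5, 0.5], [5.5, 5.5]]))  # -> [0 1]
```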
Decision Tree with specified attribute classification
As seen in the example, the decision tree can also be used to create a split along a certain axis (either the y- or x-axis). Constructing a decision tree and its classification is all about finding, at each split, the attribute that yields the highest information gain, i.e., the largest reduction in entropy.
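To make entropy and information gain concrete, the sketch below computes both for a candidate axis-aligned split; the values and the threshold are made up for illustration.

```python
# Entropy and information gain for a candidate split along one axis.
# A decision tree picks, at each node, the attribute/threshold whose split
# yields the highest information gain (the largest drop in entropy).
# Illustrative sketch with made-up labels and a made-up threshold.
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(x, labels, threshold):
    left, right = labels[x <= threshold], labels[x > threshold]
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

x = np.array([1.0, 2.0, 3.0, 7.0, 8.0, 9.0])  # values along one axis
y = np.array([0, 0, 0, 1, 1, 1])

print(information_gain(x, y, threshold=5.0))  # -> 1.0 bit: a perfect split
```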
AdaBoost
AdaBoost is widely used in face detection to assess whether a face is present in an image or video frame. Boosting is an approach to machine learning based on the idea of creating a highly accurate prediction rule by combining many relatively weak and inaccurate rules.
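A minimal sketch of boosting in scikit-learn, whose default weak learner is a decision stump (a depth-1 tree), tying back to the stump described above; the synthetic dataset is made up, not a face-detection pipeline.

```python
# AdaBoost: combine many weak, slightly-better-than-chance rules into one
# accurate classifier by reweighting the training cases at each round.
# Illustrative sketch on a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, random_state=0)

boost = AdaBoostClassifier(n_estimators=50, random_state=0)  # 50 weighted stumps
boost.fit(X, y)
print("training accuracy:", boost.score(X, y))
```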
Naive Bayes
Naive Bayes is a classification technique based on Bayes' theorem with an assumption of independence between predictors. In simple terms, a naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. For example, a fruit may be considered an apple if it is red, round, and about 3 inches in diameter. Even if these features depend on each other or on the existence of the other features, a naive Bayes classifier considers all of these properties to contribute independently to the probability that this fruit is an apple.
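The sketch below echoes the apple example with Gaussian naive Bayes; the feature encoding (redness, roundness, diameter) and values are made up for illustration.

```python
# Gaussian naive Bayes: each feature contributes independently to the class
# probability via Bayes' theorem. Illustrative sketch with made-up
# (redness, roundness, diameter-in-inches) features.
from sklearn.naive_bayes import GaussianNB

X = [[0.9, 0.9, 3.0], [0.8, 0.95, 3.2],   # apples
     [0.2, 0.4, 1.0], [0.3, 0.5, 1.2]]    # other fruit
y = ["apple", "apple", "other", "other"]

nb = GaussianNB()
nb.fit(X, y)
print(nb.predict([[0.85, 0.9, 3.1]]))  # -> ['apple']
```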