Topics covered in the Workshop
- Introduction: Definition of learning systems. Goals and applications of machine learning. Aspects of developing a learning system: training data, concept representation, function approximation.
- Inductive Classification: The concept learning task. Concept learning as search through a hypothesis space. General-to-specific ordering of hypotheses.
- Decision Tree Learning: Representing concepts as decision trees. Recursive induction of decision trees. Picking the best splitting attribute: entropy and information gain. Searching for simple trees and computational complexity. Occam’s razor. Overfitting, noisy data, and pruning.
- Ensemble Learning: Using committees of multiple hypotheses. Bagging, boosting, and DECORATE. Active learning with ensembles.
- Experimental Evaluation of Learning Algorithms: Measuring the accuracy of learned hypotheses. Comparing learning algorithms: cross-validation, learning curves, and statistical hypothesis testing.
- Computational Learning Theory: Models of learnability: learning in the limit; probably approximately correct (PAC) learning. Sample complexity: quantifying the number of examples needed to PAC learn.
- Rule Learning: Propositional and First-Order: Translating decision trees into rules. Heuristic rule induction using separate-and-conquer search and information gain.
- Artificial Neural Networks: Neurons and biological motivation. Linear threshold units. Perceptrons: representational limitation and gradient descent training.
- Support Vector Machines: Maximum margin linear separators. Quadratic programming solution to finding maximum margin separators. Kernels for learning non-linear functions.
- Bayesian Learning: Probability theory and Bayes rule. Naive Bayes learning algorithm.
- Instance-Based Learning: Constructing explicit generalizations versus comparing to past specific examples. k-Nearest-neighbor algorithm. Case-based learning.
- Text Classification: Bag of words representation. Vector space model and cosine similarity. Relevance feedback and Rocchio algorithm.
- Clustering and Unsupervised Learning: Learning from unclassified data. Clustering. Hierarchical Agglomerative Clustering. k-means partitional clustering.
- Language Learning: Classification problems in language: word-sense disambiguation, sequence labeling. Hidden Markov models (HMMs).
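For a taste of what the practice sessions build toward, the entropy and information-gain split criterion from the Decision Tree Learning topic can be sketched as follows. This is an illustrative Python sketch (the lab itself uses R); the function names and the dict-based example format are assumptions of the sketch, not part of any particular library:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a collection of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Expected reduction in entropy from splitting on `attribute`.

    `examples` is a list of dicts mapping attribute name -> value.
    """
    n = len(labels)
    # Partition the labels by the value each example takes on `attribute`.
    partitions = {}
    for example, label in zip(examples, labels):
        partitions.setdefault(example[attribute], []).append(label)
    # Weighted average entropy of the partitions (the "remainder").
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder
```

A perfectly mixed binary sample has entropy 1 bit, and an attribute that separates the two classes completely recovers all of it, so its information gain equals the full entropy.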
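The perceptron training rule mentioned under Artificial Neural Networks is a mistake-driven update over a linear threshold unit. A minimal Python sketch, assuming labels in {-1, +1}; the parameter names `lr` and `epochs` are my own:

```python
def perceptron_train(X, y, epochs=10, lr=1.0):
    """Train a linear threshold unit with the perceptron rule.

    X is a list of feature lists; y holds labels in {-1, +1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * activation <= 0:  # misclassified: nudge toward the example
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def perceptron_predict(w, b, xi):
    """Classify with the learned linear separator."""
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else -1
```

On linearly separable data (for instance, the AND function with -1/+1 labels) the rule converges to a separating hyperplane; the representational limitation listed in the topic shows up on XOR, which no single unit can fit.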
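The Naive Bayes learning algorithm under Bayesian Learning, in its multinomial bag-of-words form with add-one (Laplace) smoothing, might look like the sketch below; the two-function split and whitespace tokenization are simplifying assumptions of this illustration:

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # class -> word -> count
    vocab = set()
    for doc, y in zip(docs, labels):
        for w in doc.lower().split():
            word_counts[y][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab

def predict_naive_bayes(model, doc):
    """Pick the class maximizing log P(class) + sum of log P(word | class)."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for c, nc in class_counts.items():
        lp = math.log(nc / n_docs)  # log prior
        total = sum(word_counts[c].values())
        for w in doc.lower().split():
            # Laplace-smoothed estimate of P(w | c)
            lp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

Working in log space avoids underflow from multiplying many small probabilities, and the add-one count keeps unseen words from zeroing out a class entirely.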
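The k-nearest-neighbor algorithm from Instance-Based Learning reduces to a sort and a majority vote, which is the point of the contrast drawn in that topic: no explicit generalization is constructed before query time. A minimal sketch, assuming numeric feature vectors and Euclidean distance:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    top = [y for _, y in dists[:k]]
    return Counter(top).most_common(1)[0][0]
```

All work happens at prediction time; "training" is just storing the examples.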
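Cosine similarity over bag-of-words vectors, as listed under Text Classification, can be computed directly from term-frequency counts; this sketch tokenizes by whitespace and lowercases, both simplifying assumptions:

```python
import math
from collections import Counter

def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between two bag-of-words term-frequency vectors."""
    a = Counter(doc_a.lower().split())
    b = Counter(doc_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

Identical documents score 1.0 and documents with no shared terms score 0.0; a full vector-space retrieval system would weight the counts (e.g. TF-IDF) before taking the cosine.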
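k-means partitional clustering (from Clustering and Unsupervised Learning) is usually presented as Lloyd's algorithm: alternate between assigning points to their nearest centroid and recomputing each centroid as the mean of its cluster. A sketch with random initial centroids drawn from the data; the `seed` parameter is my own addition for reproducibility:

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm over a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from the data
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters
```

Because the objective only ever decreases, the algorithm converges, but only to a local optimum; in practice it is rerun from several random initializations.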
LAB
- We will use RStudio and R packages for the practice sessions.
- We will cover one real-world project using machine learning.