Victor Lavrenko: YouTube

Naive Bayes Classifier

  • IAML5.1: Overview
  • IAML5.2: Bayesian classification
  • IAML5.3: Class model and the prior
  • IAML5.4: Role of denominator in Naive Bayes
  • IAML5.5: Probabilistic classifiers: generative vs discriminative
  • IAML5.6: Independence assumption in Naive Bayes
  • IAML5.7: Mutual independence vs conditional independence
  • IAML5.8: Naive Bayes for real-valued data
  • IAML5.9: Gaussian Naive Bayes classifier
  • IAML5.10: Naive Bayes decision boundary
  • IAML5.11: Example where Naive Bayes fails
  • IAML5.12: Naive Bayes for spam detection
  • IAML5.13: The zero-frequency problem
  • IAML5.14: Missing values in Naive Bayes
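
The ideas from this playlist — Bayesian classification, the independence assumption, spam detection, and the zero-frequency fix — can be sketched as a tiny multinomial Naive Bayes classifier with add-one (Laplace) smoothing. The training data and function names below are illustrative, not taken from the lectures:

```python
from collections import Counter
from math import log

# Toy training data (hypothetical): word lists labelled spam / ham.
train = [
    ("spam", ["win", "money", "now"]),
    ("spam", ["win", "prize"]),
    ("ham",  ["meeting", "now"]),
    ("ham",  ["project", "meeting", "notes"]),
]

def fit(data):
    """Count word frequencies per class and class (prior) frequencies."""
    word_counts = {}          # class -> Counter of words
    class_counts = Counter()  # class -> number of documents
    for label, words in data:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
    return word_counts, class_counts

def predict(words, word_counts, class_counts):
    """Return the class maximizing log P(c) + sum_w log P(w|c),
    with add-one smoothing so unseen words never give P(w|c) = 0."""
    vocab = {w for c in word_counts for w in word_counts[c]}
    total_docs = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c in class_counts:
        score = log(class_counts[c] / total_docs)  # the prior
        total_words = sum(word_counts[c].values())
        for w in words:
            score += log((word_counts[c][w] + 1) /
                         (total_words + len(vocab)))
        if score > best_score:
            best, best_score = c, score
    return best

wc, cc = fit(train)
print(predict(["win", "money"], wc, cc))  # -> spam
```

Note that the denominator P(x) from lecture 5.4 is dropped entirely: it is the same for every class, so it cannot change the argmax.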

Decision Tree Learning

  • IAML7.1 Decision Trees: an introduction
  • IAML7.2 Decision tree example
  • IAML7.3 Quinlan’s ID3 algorithm
  • IAML7.4 Decision tree: split purity
  • IAML7.5 Decision tree entropy
  • IAML7.6 Information gain
  • IAML7.7 Overfitting in decision trees
  • IAML7.8 Decision tree pruning
  • IAML7.9 Information gain ratio
  • IAML7.10 Decision trees are DNF formulas
  • IAML7.11 Decision trees and real-valued data
  • IAML7.12 Decision tree regression
  • IAML7.13 Pros and cons of decision trees
  • IAML7.14 Random forest algorithm
  • IAML7.15 Summary
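
The entropy and information-gain lectures (7.5–7.6) describe the split criterion at the heart of Quinlan's ID3. A minimal sketch, with hypothetical toy data:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -sum_i p_i * log2(p_i) over the class proportions in S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy of the parent minus the weighted entropy of the
    children produced by splitting on `attr` (ID3 picks the attribute
    maximizing this)."""
    n = len(labels)
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [l for r, l in zip(rows, labels) if r[attr] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Toy data (hypothetical): outlook fully determines the label,
# so splitting on it recovers all of the parent's entropy.
rows = [{"outlook": "sunny"}, {"outlook": "sunny"},
        {"outlook": "rain"}, {"outlook": "rain"}]
labels = ["no", "no", "yes", "yes"]
print(information_gain(rows, labels, "outlook"))  # 1.0 bit
```

A pure split drives the children's entropy to zero, which is exactly the "split purity" notion from lecture 7.4.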

Generalization and Evaluation

  • IAML8.1 Generalization in machine learning
  • IAML8.2 Overfitting and underfitting
  • IAML8.3 Examples of overfitting and underfitting
  • IAML8.4 How to control overfitting
  • IAML8.5 Generalization error
  • IAML8.6 Estimating the generalization error
  • IAML8.7 Confidence interval for generalization
  • IAML8.8 Why we need validation sets
  • IAML8.9 Cross-validation
  • IAML8.10 Leave-one-out cross-validation
  • IAML8.11 Stratified sampling
  • IAML8.12 Evaluating classification and regression
  • IAML8.13 False positives and false negatives
  • IAML8.14 Classification error and accuracy
  • IAML8.15 When classification error is wrong
  • IAML8.16 Recall, precision, miss and false alarm
  • IAML8.17 Classification cost and utility
  • IAML8.18 Receiver Operating Characteristic (ROC) curve
  • IAML8.19 Evaluating regression: MSE, MAE, CC
  • IAML8.20 Mean squared error and outliers
  • IAML8.21 Mean absolute error (MAE)
  • IAML8.22 Correlation coefficient
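
The classification metrics from lectures 8.13–8.16 all derive from the four confusion counts. A small sketch (the labels and helper name are illustrative):

```python
def evaluate(y_true, y_pred, positive=1):
    """Confusion counts and the derived metrics: accuracy,
    precision (fraction of predicted positives that are correct),
    and recall (fraction of true positives found, i.e. 1 - miss rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

m = evaluate([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
# one false positive and one false negative:
# accuracy 0.6, precision 2/3, recall 2/3
```

Lecture 8.15 explains why accuracy alone misleads on skewed classes — with 99% negatives, predicting "negative" everywhere scores 99% accuracy but 0 recall.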

k-Nearest Neighbor Algorithm

  • kNN.1 Overview
  • kNN.2 Intuition for the nearest-neighbor method
  • kNN.3 Voronoi cells and decision boundary
  • kNN.4 Sensitivity to outliers
  • kNN.5 Nearest-neighbor classification algorithm
  • kNN.6 MNIST digit recognition
  • kNN.7 Nearest-neighbor regression algorithm
  • kNN.8 Nearest-neighbor regression example
  • kNN.9 Number of nearest neighbors to use
  • kNN.10 Similarity / distance measures
  • kNN.11 Breaking ties between nearest neighbors
  • kNN.12 Parzen windows, kernels and SVM
  • kNN.13 Pros and cons of nearest-neighbor methods
  • kNN.14 Computational complexity of finding nearest-neighbors
  • kNN.15 K-d tree algorithm
  • kNN.16 Locality sensitive hashing (LSH)
  • kNN.17 Inverted index
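
The classification algorithm from kNN.5 fits in a few lines: find the k closest training points and take a majority vote. The brute-force search below is the O(n) baseline that the k-d tree and LSH lectures speed up; data and names are hypothetical:

```python
from collections import Counter
from math import dist  # Euclidean distance (Python 3.8+)

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training
    points; ties go to the label reaching the count first (kNN.11
    discusses more principled tie-breaking)."""
    neighbors = sorted(train, key=lambda xy: dist(xy[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Two well-separated toy clusters.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (1, 1), k=3))  # -> a
```

Using k > 1 is what tames the outlier sensitivity shown in kNN.4: a single mislabeled point is outvoted by its neighbors.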

K-means Clustering

  • Clustering 1: monothetic vs. polythetic
  • Clustering 2: soft vs. hard clustering
  • Clustering 3: overview of methods
  • Clustering 4: K-means clustering: how it works
  • Clustering 5: K-means objective and convergence
  • Clustering 6: how many clusters?
  • Clustering 7: intrinsic vs. extrinsic evaluation
  • Clustering 8: alignment and pair-based evaluation
  • Clustering 9: image representation
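
The "how it works" and convergence lectures describe Lloyd's algorithm: alternately assign each point to its nearest center, then move each center to the mean of its points. A minimal sketch with made-up data:

```python
from math import dist
from statistics import mean

def kmeans(points, centers, iters=10):
    """Lloyd's algorithm. Each sweep can only lower the sum of
    squared distances to centers, so it converges (lecture 5)."""
    for _ in range(iters):
        # Assignment step: hard-assign each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            clusters[i].append(p)
        # Update step: move each center to its cluster's mean
        # (empty clusters keep their old center).
        centers = [tuple(mean(coord) for coord in zip(*cl)) if cl else ctr
                   for cl, ctr in zip(clusters, centers)]
    return centers

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans(points, centers=[(0, 0), (10, 10)]))
```

This is hard clustering in the sense of lecture 2: every point belongs to exactly one cluster, with no fractional membership.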

IR15 Web Search and PageRank

  • Web search 1: more data = higher precision
  • Web search 2: big data beats clever algorithms
  • Web search 3: introduction to PageRank
  • Web search 4: PageRank algorithm: how it works
  • Web search 5: PageRank at convergence
  • Web search 6: PageRank using MapReduce
  • Web search 7: sink nodes in PageRank
  • Web search 8: hubs and authorities
  • Web search 9: link spam
  • Web search 10: anchor text
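
The PageRank lectures (3–5) describe power iteration over the link graph: each page's score is a damped sum of the scores of pages linking to it, divided by their out-degree. A small sketch on a hypothetical three-page graph (sink nodes, covered in lecture 7, are not handled here — every node must have out-links):

```python
def pagerank(links, d=0.85, iters=50):
    """Power iteration: PR(p) = (1-d)/N + d * sum over q linking to p
    of PR(q)/outdegree(q). `links` maps every node to its out-links."""
    nodes = list(links)
    pr = {n: 1 / len(nodes) for n in nodes}  # uniform start
    for _ in range(iters):
        new = {n: (1 - d) / len(nodes) for n in nodes}
        for n, outs in links.items():
            for m in outs:
                new[m] += d * pr[n] / len(outs)  # spread n's score
        pr = new
    return pr

links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
print(ranks)  # c collects links from both a and b, so it ranks highest
```

At convergence (lecture 5) the scores stop changing and sum to 1 — they form the stationary distribution of the random surfer.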

IR7 Inverted Indexing

  • Indexing 1: what makes google fast
  • Indexing 2: inverted index
  • Indexing 3: sparseness and linear merge
  • Indexing 4: phrases and proximity
  • Indexing 5: XML, structure and metadata
  • Indexing 6: delta encoding (compression)
  • Indexing 7: v-byte encoding (compression)
  • Indexing 8: doc-at-a-time query execution
  • Indexing 9: doc-at-a-time worst case
  • Indexing 10: term-at-a-time query execution
  • Indexing 11: query execution tradeoffs
  • Indexing 12: expected cost of execution
  • Indexing 13: heuristics for faster search
  • Indexing 14: structured query execution
  • Indexing 15: index construction
  • Indexing 16: MapReduce
  • Indexing 17: distributed search
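
The inverted index (lecture 2) and linear merge (lecture 3) are simple to sketch: map each term to a sorted posting list of document IDs, then intersect two lists in a single linear pass. Data and names below are illustrative:

```python
def build_index(docs):
    """Map each term to a sorted list of the IDs of documents
    containing it (IDs ascend because docs are scanned in order)."""
    index = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.split()):  # one posting per doc per term
            index.setdefault(term, []).append(doc_id)
    return index

def linear_merge(a, b):
    """Intersect two sorted posting lists in O(len(a) + len(b)),
    advancing only the pointer at the smaller doc ID."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

docs = ["the quick fox", "the lazy dog", "quick brown dog"]
idx = build_index(docs)
print(linear_merge(idx["quick"], idx["dog"]))  # -> [2]
```

The sparseness point from lecture 3 is why this beats scanning documents: posting lists for most query terms are tiny compared to the collection.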

IR13 Evaluating Search Engines

  • Evaluation 1: overview
  • Evaluation 2: research hypotheses
  • Evaluation 3: effectiveness vs. efficiency
  • Evaluation 4: Cranfield paradigm
  • Evaluation 5: relevance judgments
  • Evaluation 6: precision and recall
  • Evaluation 7: why we can’t use accuracy
  • Evaluation 8: F-measure
  • Evaluation 9: when recall/precision is misleading
  • Evaluation 10: recall and precision over ranks
  • Evaluation 11: interpolated recall-precision plot
  • Evaluation 12: mean average precision
  • Evaluation 13: MAP vs NDCG
  • Evaluation 14: query logs and click deviation
  • Evaluation 15: binary preference and Kendall tau
  • Evaluation 16: hypothesis testing
  • Evaluation 17: statistical significance test
  • Evaluation 18: the sign test
  • Evaluation 19: training / testing splits
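
The mean average precision lecture (12) builds on precision over ranks (10): average precision credits precision@k at every rank where a relevant document appears, and MAP averages that over queries. A minimal sketch with a hypothetical ranking:

```python
def average_precision(ranked, relevant):
    """AP for one query: mean of precision@k over the ranks k at which
    a relevant document is retrieved (unretrieved relevant docs
    contribute 0). MAP is this averaged over a set of queries."""
    hits, total = 0, 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k  # precision at this rank
    return total / len(relevant) if relevant else 0.0

# Relevant docs found at ranks 2 and 4: AP = (1/2 + 2/4) / 2 = 0.5
ap = average_precision(["d3", "d1", "d7", "d2"], {"d1", "d2"})
print(ap)  # 0.5
```

Dividing by the total number of relevant documents (not just those retrieved) is what makes AP recall-sensitive, which the MAP-vs-NDCG lecture contrasts with graded-relevance measures.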

IR10 Crawling the Web

  • Web crawling 1: sources of data
  • Web crawling 2: blogs, tweets, news feeds
  • Web crawling 3: the algorithm
  • Web crawling 4: inside an HTTP request
  • Web crawling 5: robots.txt
  • Web crawling 6: keeping index fresh
