John D. Lafferty, Andrew McCallum, Fernando C. N. Pereira
18th International Conference on Machine Learning, 2001
We present conditional random fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, ...
Christopher J. C. Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, Gregory N. Hullender
22nd International Conference on Machine Learning, vol. 119, 2005
We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function, and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. ...
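The "simple probabilistic cost function" this abstract refers to can be sketched as follows: a minimal, hypothetical implementation assuming RankNet's pairwise setup, where the difference of two model scores is mapped through a logistic function to a predicted probability that one item ranks above the other, and the cost is the cross entropy against a target probability. The function name and the `eps` guard are illustrative, not from the paper.

```python
import math

def ranknet_pair_cost(s_i, s_j, target_p=1.0):
    """Pairwise cross-entropy cost in the spirit of RankNet.

    s_i, s_j  -- model scores for items i and j
    target_p  -- target probability that i should rank above j
    """
    o = s_i - s_j                       # score difference
    p = 1.0 / (1.0 + math.exp(-o))      # logistic map to P(i ranks above j)
    eps = 1e-12                         # guard against log(0)
    return -(target_p * math.log(p + eps)
             + (1.0 - target_p) * math.log(1.0 - p + eps))
```

Because the cost is differentiable in the scores, its gradient can be backpropagated through a neural network that produces them, which is the core of the RankNet idea.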
14th International Conference on Machine Learning, 1997
This paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods were evaluated, including term selection based on document frequency ...
16th International Conference on Machine Learning, 1999
This paper introduces Transductive Support Vector Machines (TSVMs) for text classification. While regular Support Vector Machines (SVMs) try to induce a general decision function for a learning task, Transductive Support Vector Machines take into ...
Yoav Freund, Robert E. Schapire
13th International Conference on Machine Learning, 1996
In an earlier paper, we introduced a new "boosting" algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better ...
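The combining scheme the abstract alludes to can be illustrated with a minimal sketch, assuming decision-stump weak learners supplied by the caller: each round selects the weak classifier with the lowest weighted error, upweights the examples it gets wrong, and the final prediction is a weighted vote. All names here are illustrative; this is not the paper's own code.

```python
import math

def adaboost(X, y, stumps, rounds=10):
    """Minimal AdaBoost sketch.

    X       -- list of inputs
    y       -- list of labels in {-1, +1}
    stumps  -- candidate weak classifiers, each h(x) -> -1 or +1
    Returns a combined classifier (weighted majority vote).
    """
    n = len(X)
    w = [1.0 / n] * n                  # start with uniform example weights
    ensemble = []                      # list of (alpha, weak classifier)
    for _ in range(rounds):
        # pick the weak classifier with the lowest weighted error
        h, err = min(
            ((h, sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi))
             for h in stumps),
            key=lambda t: t[1])
        if err >= 0.5:                 # no weak learner beats chance: stop
            break
        err = max(err, 1e-12)          # avoid division by zero
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, h))
        # upweight mistakes, downweight correct predictions, renormalize
        w = [wi * math.exp(-alpha * yi * h(xi))
             for wi, xi, yi in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
```

The exponential reweighting is what forces later rounds to concentrate on the examples earlier weak learners misclassified.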
17th International Conference on Machine Learning, 2000
Despite its popularity for general clustering, K-means suffers three major shortcomings; it scales poorly computationally, the number of clusters K has to be supplied by the user, and the search is prone to local minima. We propose solutions for ...
14th International Conference on Machine Learning, 1997
The proliferation of topic hierarchies for text documents has resulted in a need for tools that automatically classify new documents within such hierarchies. Existing classification schemes which ignore the hierarchical structure and treat the topics ...
Jason D. M. Rennie, Nathan Srebro
22nd International Conference on Machine Learning, vol. 119, 2005
Maximum Margin Matrix Factorization (MMMF) was recently suggested (Srebro et al., 2005) as a convex, infinite dimensional alternative to low-rank approximations and standard factor models. MMMF can be formulated as a semi-definite programming (SDP) ...
Xiaojin Zhu, Zoubin Ghahramani, John D. Lafferty
20th International Conference on Machine Learning, 2003
An approach to semi-supervised learning is proposed that is based on a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances.
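The graph construction in this abstract lends itself to a short sketch: a minimal, assumed implementation of graph-based label propagation in the spirit of the Gaussian random field model, where labeled vertices are clamped and each unlabeled vertex relaxes to the weighted average of its neighbors (a harmonic solution). The dense-matrix representation and iteration count are simplifications for illustration.

```python
def propagate_labels(W, labels, iters=100):
    """Iteratively relax unlabeled nodes toward the harmonic solution.

    W      -- symmetric edge-weight matrix (list of lists), W[i][j] >= 0
    labels -- dict mapping labeled node index -> value in [0, 1]
    Returns a list of values, one per node; labeled nodes stay clamped.
    """
    n = len(W)
    f = [labels.get(i, 0.5) for i in range(n)]   # unlabeled start at 0.5
    for _ in range(iters):
        for i in range(n):
            if i in labels:
                continue                          # clamp labeled vertices
            total = sum(W[i][j] for j in range(n) if j != i)
            if total > 0:
                # weighted average of neighboring values
                f[i] = sum(W[i][j] * f[j]
                           for j in range(n) if j != i) / total
    return f
```

On a chain graph with the two endpoints labeled 0 and 1, the unlabeled interior nodes converge to evenly spaced values, reflecting how edge weights encoding similarity smooth the labeling over the graph.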