Carnegie Mellon University, Pittsburgh, USA
Conference on Computational Learning Theory, 1998 – COLT
We consider the problem of using a large unlabeled sample to boost performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a problem setting motivated by the task of learning to ...
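The setting above is the co-training loop: two learners, each trained on a different "view" of the data, take turns labeling unlabeled examples for one another. Below is a toy sketch of that loop; the synthetic two-view data and the trivial majority-vote learners are illustrative assumptions (the paper's confidence-based example selection is simplified to fixed batches).

```python
import random

# Toy co-training sketch: each example has two views (v1, v2), each a noisy
# copy of the hidden label. This data model and the 1-feature majority-vote
# "learner" are made-up illustrations, not the paper's setting.
random.seed(0)

def make_example(label):
    v1 = label if random.random() < 0.9 else 1 - label
    v2 = label if random.random() < 0.9 else 1 - label
    return (v1, v2, label)

labeled = [make_example(random.randint(0, 1)) for _ in range(5)]
unlabeled = [make_example(random.randint(0, 1)) for _ in range(200)]

def train(view, data):
    # "Learner": predict the majority label seen for each value of this view.
    votes = {0: [0, 0], 1: [0, 0]}
    for ex in data:
        votes[ex[view]][ex[2]] += 1
    return {v: (0 if c[0] >= c[1] else 1) for v, c in votes.items()}

for _ in range(5):  # co-training rounds
    h1 = train(0, labeled)  # classifier over view 1
    h2 = train(1, labeled)  # classifier over view 2
    # each classifier pseudo-labels a few unlabeled examples for the other
    batch, unlabeled = unlabeled[:20], unlabeled[20:]
    labeled += [(v1, v2, h1[v1]) for (v1, v2, _) in batch[:10]]
    labeled += [(v1, v2, h2[v2]) for (v1, v2, _) in batch[10:]]

h1 = train(0, labeled)
acc = sum(h1[v1] == y for v1, _, y in unlabeled) / len(unlabeled)
```

The point of the loop is that each view's classifier supplies training signal the other view could not extract on its own; the small labeled seed set only bootstraps the process.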
Avrim Blum, Katrina Ligett, Aaron Roth
Proceedings of the 40th Annual ACM Symposium on Theory of Computing, 2008 – STOC
We demonstrate that, ignoring computational constraints, it is possible to release privacy-preserving databases that are useful for all queries over a discretized domain from any given concept class with polynomial VC-dimension. We show a new lower ...
Avrim Blum, Cynthia Dwork, Frank McSherry, Kobbi Nissim
Twenty-Fourth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2005 – PODS
We consider a statistical database in which a trusted administrator introduces noise to the query responses with the goal of maintaining privacy of individual database entries. In such a database, a query consists of a pair (S, f) where S is a set of ...
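The query model in the abstract can be sketched concretely: a query (S, f) asks for the sum of a predicate f over a row subset S, and the curator perturbs the answer before releasing it. Gaussian noise with a fixed scale is an illustrative assumption here, not the paper's exact mechanism or its calibration of noise to query count.

```python
import random

# Sketch of the (S, f) query model: S is a set of row indices, f a predicate
# on a row; the curator returns sum_{i in S} f(d_i) plus noise.
# The database contents and noise scale below are made-up illustrations.
random.seed(1)

database = [random.randint(0, 1) for _ in range(1000)]  # private bits

def noisy_query(db, S, f, noise_scale=20.0):
    true_answer = sum(f(db[i]) for i in S)
    return true_answer + random.gauss(0, noise_scale)  # perturbed release

S = range(0, 500)
answer = noisy_query(database, S, lambda x: x)  # noisy count over first 500 rows
```

The answer is useful for statistics over large subsets (the noise is small relative to |S|) while masking any single entry's contribution, which is the tension the paper analyzes.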
Artificial Intelligence, vol. 97, no. 1-2, 1997 – AI
In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant ...
Avrim Blum, Merrick L. Furst, Jeffrey C. Jackson, Michael J. Kearns, Yishay Mansour, Steven Rudich
Proceedings of the Twenty-Sixth Annual ACM Symposium on Theory of Computing, 1994 – STOC
We present new results, both positive and negative, on the well-studied problem of learning disjunctive normal form (DNF) expressions. We first prove that an algorithm due to Kushilevitz and Mansour [16] can be used to weakly learn DNF using ...
Nikhil Bansal, Avrim Blum, Shuchi Chawla
Machine Learning, vol. 56, no. 1-3, 2004 – ML
We consider the following clustering problem: we have a complete graph on n vertices (items), where each edge (u, v) is labeled either + or – depending on whether u and v have been deemed to be similar or different. The goal is to produce a ...
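The objective being described is correlation clustering: given +/- edge labels on a complete graph, a clustering pays for every "+" edge split across clusters and every "–" edge kept inside one. A minimal sketch of that disagreement count, on a made-up 4-vertex example (the paper's approximation algorithms are not reproduced here):

```python
# Disagreement cost in correlation clustering: a clustering disagrees with a
# "+" edge it separates and with a "-" edge it keeps together.

def disagreements(edge_labels, clustering):
    """edge_labels: dict {(u, v): '+' or '-'}; clustering: dict vertex -> cluster id."""
    cost = 0
    for (u, v), sign in edge_labels.items():
        same_cluster = clustering[u] == clustering[v]
        if (sign == '+') != same_cluster:
            cost += 1
    return cost

# Hypothetical complete graph on 4 vertices.
edge_labels = {
    (0, 1): '+', (0, 2): '+', (0, 3): '-',
    (1, 2): '+', (1, 3): '-', (2, 3): '-',
}
print(disagreements(edge_labels, {0: 0, 1: 0, 2: 0, 3: 1}))  # → 0
```

Note that the number of clusters is not fixed in advance; it falls out of minimizing disagreements, which is part of what makes the problem interesting.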
18th International Conference on Machine Learning, 2001 – ICML
Many application domains suffer from not having enough labeled training data for learning. However, large...
Artificial Intelligence, vol. 90, no. 1-2, 1997 – AI
We introduce a new approach to planning in STRIPS-like domains based on constructing and analyzing a compact structure we call a Planning Graph. We describe a new planner, Graphplan, that uses this paradigm. Graphplan always returns a ...
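The planning-graph idea can be sketched as growing alternating fact and action levels: at each level, every action whose preconditions are present adds its effects, until the goal facts appear or the graph reaches a fixed point. The toy tea-making domain below is a hypothetical illustration; mutex propagation and Graphplan's backward solution extraction are omitted.

```python
# Minimal sketch of planning-graph level expansion in the spirit of Graphplan
# (delete effects and mutex reasoning omitted for brevity).

# An action: (name, preconditions, add effects) — a made-up STRIPS-like domain.
ACTIONS = [
    ("boil_water", {"have_water"}, {"hot_water"}),
    ("brew_tea",   {"hot_water", "have_teabag"}, {"tea"}),
]

def expand(initial_facts, goal, max_levels=10):
    """Grow fact levels until every goal fact appears, or a fixed point is hit."""
    facts = set(initial_facts)
    for level in range(max_levels):
        if goal <= facts:
            return level  # goal facts all reachable at this fact level
        applicable = [a for a in ACTIONS if a[1] <= facts]
        new_facts = set(facts)
        for _, _, effects in applicable:
            new_facts |= effects
        if new_facts == facts:  # no growth: goal unreachable
            return None
        facts = new_facts
    return None

print(expand({"have_water", "have_teabag"}, {"tea"}))  # → 2
```

The compactness the abstract refers to comes from this structure growing polynomially with the number of levels, rather than enumerating world states.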
Avrim Blum, Adam Tauman Kalai, Hal Wasserman
Journal of the ACM, vol. 50, no. 4, 2003 – JACM
We describe a slightly subexponential time algorithm for learning parity functions in the presence of random classification noise, a problem closely related to several cryptographic and coding problems. Our algorithm runs in polynomial time for the ...
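The example oracle in this problem can be sketched directly: examples x are uniform over {0,1}^n, the label is the parity of x restricted to a hidden subset of coordinates, and each label is flipped independently with probability η. This sketch generates such samples only; the subexponential learning algorithm itself is not reproduced, and the secret, n, and η below are illustrative choices.

```python
import random

# Noisy-parity example oracle: labels are <x, secret> mod 2, flipped w.p. eta.
random.seed(42)

n, eta = 8, 0.1
secret = [random.randint(0, 1) for _ in range(n)]  # hidden parity set

def sample():
    x = [random.randint(0, 1) for _ in range(n)]
    clean = sum(xi & si for xi, si in zip(x, secret)) % 2  # true parity label
    label = clean ^ (random.random() < eta)  # random classification noise
    return x, label

samples = [sample() for _ in range(1000)]
# Empirical fraction of labels that disagree with the true parity (~ eta).
noise_rate = sum((sum(a & b for a, b in zip(x, secret)) % 2) != y
                 for x, y in samples) / len(samples)
```

Without noise, Gaussian elimination recovers the secret from O(n) examples; the point of the paper is that the noisy version, while far harder, still admits a slightly subexponential algorithm.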