- Published a comprehensive introduction to conditional random fields (CRFs).
- Wrote an introductory article on information extraction for those unfamiliar with machine learning.
- Created MALLET, a Java-based machine learning toolkit for natural language processing.
- Analyzed topical trends over five years of ICML conferences before 2008.
- Three of his papers are among the most cited in computer science according to CiteSeer.
Research Experience
- Developed FACTORIE, a toolkit for deployable probabilistic modeling.
- Served as General Chair of ICML 2012.
- Launched Rexa, a new search engine for research papers, grants, people, and topics.
Background
The main research goal is to dramatically increase the ability to mine actionable knowledge from unstructured text. Particular interests include information extraction from the Web, understanding connections between people and organizations, expert finding, social network analysis, and mining the scientific literature and community. Methods include statistical machine learning, natural language processing, information retrieval, and data mining, with a tendency toward probabilistic approaches and graphical models.
Miscellany
- Building an 'open reviewing' system for ICLR 2013 and other venues.
- Interested in alternative approaches to peer review.