
Email:

Office:
Jolley Hall #537
CSE department
Washington University in St. Louis
Saint Louis, MO 63130

Mail:
One Brookings Drive
Campus Box 1045
Saint Louis, MO 63130

Welcome to Minmin Chen's homepage



About me

I moved to Google in fall 2017. Before that, I was a staff researcher at Criteo Labs (starting March 2014) and a machine learning scientist at Amazon, where I worked on the Amazon Go project. I obtained my Ph.D. from the Department of Computer Science and Engineering at Washington University in St. Louis, advised by Prof. Kilian Weinberger. Before joining Wash U, I obtained my B.S. degree from the Department of Electronics and Communications at Tsinghua University, Beijing, China. During my graduate studies, I spent the summer of 2010 with the machine learning group at Microsoft Research Asia, the summer of 2012 with the machine learning group at Microsoft Research Redmond, and the fall of 2012 with the advertising science group at Yahoo! Labs.



Update

Mar 2017, the code for Efficient Vector Representation for Documents through Corruption has been posted on GitHub.
Jul 2014, the code for Marginalized Denoising Autoencoders for Nonlinear Representations has been posted here.
Apr 2014, our work on Marginalized Denoising Autoencoders for Nonlinear Representations has been accepted to ICML.
Mar 2014, I have moved to Criteo.
Oct 2013, the code and data for Fast Image Tagging have been posted here.


Research interests

I am interested in machine learning and optimization in general.
My research focuses on two aspects of large-scale learning:
  • How can we make use of the different levels of supervision carried in data?
  • How can we develop algorithms that handle data at such scale at runtime?
Topics: domain adaptation/transfer learning, learning with weak supervision, semi-supervised learning, budgeted learning, large-scale sequential learning
Applications: text mining, image classification, ranking, personalization, computational biology, healthcare


Publications

2017

M. Chen
Efficient Vector Representation for Documents through Corruption
5th International Conference on Learning Representations (ICLR), 2017.
[paper][code]
Extended the paper with experiments on the word relationship dataset, showing that Doc2VecC generates better word embeddings than Word2Vec.

2015

M. Chen, K. Weinberger, Z. Xu, F. Sha
Marginalized Stacked Denoising Autoencoders
Journal of Machine Learning Research (JMLR), Vol. 16, 2015.
[paper]

Z. Chen, M. Chen, K. Weinberger, W. Zhang
Marginalized Denoising for Link Prediction and Multi-Label Learning
29th AAAI Conference on Artificial Intelligence (AAAI), 2015.
[paper]

2014

M. Chen, K. Weinberger, F. Sha, Y. Bengio
Marginalized Denoising Autoencoders for Nonlinear Representation
31st International Conference on Machine Learning (ICML), 2014.
[paper][code]

Z. Xu, M. Kusner, K. Weinberger, M. Chen, O. Chapelle
Budgeted Learning with Trees and Cascades
Journal of Machine Learning Research (JMLR), Vol. 15, 2014.
[paper][code]

2013

M. Chen, A. Zheng, K. Weinberger
Fast Image Tagging
30th International Conference on Machine Learning (ICML), 2013.
[paper][code][data]

M. Chen
Learning with Single View Co-training and Marginalized Dropout
Ph.D. dissertation, Washington University in St. Louis, 2013.
[thesis]

L. van der Maaten, M. Chen, S. Tyree, K. Weinberger
Learning with Marginalized Corrupted Features
30th International Conference on Machine Learning (ICML), 2013.
[paper]

Z. Xu, M. Kusner, K. Weinberger, M. Chen
Cost Sensitive Trees of Classifiers
30th International Conference on Machine Learning (ICML), 2013.
[paper]

2012

M. Chen, K. Weinberger, A. Zheng
Learning from Incomplete Image Tags
NIPS Workshop on Large Scale Visual Recognition and Retrieval, 2012.
[extended abstract]

Z. Xu, M. Chen, K. Weinberger, F. Sha
From sBoW to dCoT: Marginalized Encoders for Text Representation
21st ACM Conference on Information and Knowledge Management (CIKM), 2012.
[paper][code]

M. Chen, Z. Xu, K. Weinberger, F. Sha
Marginalized Stacked Denoising Autoencoders for Domain Adaptation
29th International Conference on Machine Learning (ICML), 2012.
[paper] [talk] [code]

M. Chen, Z. Xu, K. Weinberger, F. Sha
Marginalized Stacked Denoising Autoencoders
The Learning Workshop (Snowbird), Cliff Lodge, Snowbird, Utah, 2012. (Oral presentation)
[paper] [talk][code]

M. Chen, Z. Xu, K. Weinberger, O. Chapelle, D. Kedem
Classifier Cascade for Minimizing Feature Evaluation Cost
15th International Conference on Artificial Intelligence and Statistics (AISTATS), 2012. (Oral presentation, 26/400)
[paper] [talk]

Y. Mao, Y. Chen, G. Hackmann, M. Chen, C. Lu, M. Kollef, T. Bailey
Early Deterioration Warning for Hospitalized Patients by Mining Clinical Data
International Journal of Knowledge Discovery in Bioinformatics, 2(3):1-20, 2012.
[paper]

2011

M. Chen, K. Weinberger, J. Blitzer
Co-training for Domain Adaptation
25th Conference on Neural Information Processing Systems (NIPS), 2011.
[paper][poster][code]

M. Chen, K. Weinberger, O. Chapelle
Classifier Cascade: Tradeoff between Accuracy and Feature Evaluation Cost
6th Annual Workshop for Women in Machine Learning (WiML), 2011. (Oral presentation)
[talk]

M. Chen, J. Sun, X. Ni, Y. Chen
Improving Context-Aware Query Classification via Adaptive Self-training
20th ACM Conference on Information and Knowledge Management (CIKM), 2011.
[paper][talk]

M. Chen, K. Weinberger, Y. Chen
Automatic Feature Decomposition for Single View Co-training
28th International Conference on Machine Learning (ICML), 2011.
[paper] [talk][code]

G. Hackmann, M. Chen, O. Chipara, C. Lu, Y. Chen, M. Kollef, T. Bailey
Toward a Two-Tier Clinical Warning System for Hospitalized Patients
American Medical Informatics Association Annual Symposium (AMIA), 2011.
[paper]

Y. Mao, Y. Chen, G. Hackmann, M. Chen, C. Lu, M. Kollef, T. Bailey
Medical Data Mining for Early Deterioration Warning in General Hospital Wards
ICDM Workshop on Biological Data Mining and its Applications in Healthcare (BioDM), 2011.

2010

Y. Chen, M. Chen
Extended Duality for Nonlinear Programming
Computational Optimization and Applications, 47(1): 33-59, 2010.
[paper]

M. Chen, J. Sun, X. Ni
Adaptive Self-Training with Max Margin Conditional Random Fields for Context Aware Query Classification
5th Annual Workshop for Women in Machine Learning (WiML), 2010.

2009

M. Chen, Y. Chen, M. Brent, A. Tenney
Constrained Optimization for Validation-Guided Conditional Random Field Learning
15th ACM Conference on Knowledge Discovery and Data Mining (KDD), 2009. (Finalist for Best Paper Award)
[paper]

M. Chen
To Improve the Speed and Generalization Ability of Conditional Random Fields
Master's thesis, Washington University in St. Louis, 2009.

M. Chen, Y. Chen, M. Brent, A. Tenney
Gradient-Based Feature Selection for Conditional Random Fields and Its Applications in Computational Genetics
21st IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 2009.
[paper]

2008

M. Chen, Y. Chen, M. Brent
CRF-OPT: An Efficient High-Quality Conditional Random Field Solver
23rd AAAI Conference on Artificial Intelligence (AAAI), 2008.
[paper]



Software

marginalized Stacked Denoising Autoencoder (mSDA)
The code for the marginalized Stacked Denoising Autoencoder, a feature learning algorithm that preserves the strong feature learning capacity of Stacked Denoising Autoencoders but is orders of magnitude faster; a sketch of the core computation appears below. (The code uses LIBSVM from http://www.csie.ntu.edu.tw/~cjlin/libsvm/.) The Amazon review dataset (Blitzer et al., 2006), which we used for applying mSDA to domain adaptation, is included.
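
The speedup comes from the fact that each marginalized denoising layer admits a closed-form least-squares solution. Below is a minimal NumPy sketch of one such layer; the function name, the small ridge term, and the tanh squashing are illustrative choices for this sketch, not the released MATLAB code.

```python
import numpy as np

def mda_layer(X, p):
    """One marginalized denoising autoencoder (mDA) layer, in closed form.

    X : (d, n) array, one column per example.
    p : probability of zeroing out each input feature.
    Returns the (d, d+1) mapping W and the hidden representation tanh(W [X; 1]).
    """
    d, n = X.shape
    Xb = np.vstack([X, np.ones((1, n))])        # append a constant bias row
    q = np.append((1.0 - p) * np.ones(d), 1.0)  # per-dimension survival probability
    S = Xb @ Xb.T                               # scatter matrix of the clean input
    Q = S * np.outer(q, q)                      # expected scatter under corruption
    np.fill_diagonal(Q, q * np.diag(S))         # a feature always co-occurs with itself
    P = S[:d, :] * q[np.newaxis, :]             # expected clean/corrupted cross terms
    W = np.linalg.solve(Q + 1e-5 * np.eye(d + 1), P.T).T  # W = P Q^{-1}, ridge for stability
    return W, np.tanh(W @ Xb)
```

Stacking simply feeds each layer's hidden output into the next layer as input.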

Co-training for Domain Adaptation (CODA)
The code for Co-training for Domain Adaptation, which extends PMC to domain adaptation tasks with an additional feature selection component. (The code uses minimize.m from http://www.gatsby.ucl.ac.uk/~edward/code/minimize/minimize.m)

Pseudo Multi-view Co-training (PMC)
The code for Pseudo Multi-view Co-training, which extends co-training to learning scenarios without an explicit multi-view representation; a background sketch of the co-training loop it generalizes follows below. (The code uses minimize.m from http://www.gatsby.ucl.ac.uk/~edward/code/minimize/minimize.m)
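
PMC learns the feature split automatically through its optimization; as background, here is a minimal sketch of the classic two-view co-training loop (Blum & Mitchell, 1998) that PMC generalizes. The classifier choice, confidence heuristic, and picks-per-round below are illustrative assumptions, not the released implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(X_lab, y_lab, X_unl, view1, view2, rounds=10, per_round=5):
    """Two-view co-training: each classifier pseudo-labels the unlabeled
    examples it is most confident about and hands them to the other."""
    X1, y1 = X_lab[:, view1], y_lab.copy()
    X2, y2 = X_lab[:, view2], y_lab.copy()
    U = X_unl.copy()
    for _ in range(rounds):
        c1 = LogisticRegression(max_iter=1000).fit(X1, y1)
        c2 = LogisticRegression(max_iter=1000).fit(X2, y2)
        if len(U) == 0:
            break
        conf1 = c1.predict_proba(U[:, view1]).max(axis=1)
        conf2 = c2.predict_proba(U[:, view2]).max(axis=1)
        pick1 = np.argsort(-conf1)[:per_round]    # c1's most confident picks
        pick2 = np.argsort(-conf2)[:per_round]    # c2's most confident picks
        X2 = np.vstack([X2, U[pick1][:, view2]])  # c1 teaches c2 ...
        y2 = np.append(y2, c1.predict(U[pick1][:, view1]))
        X1 = np.vstack([X1, U[pick2][:, view1]])  # ... and vice versa
        y1 = np.append(y1, c2.predict(U[pick2][:, view2]))
        U = np.delete(U, np.union1d(pick1, pick2), axis=0)
    return c1, c2
```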

Conditional Random Fields (CRF-OPT)
CRF-OPT is a general-purpose, high-performance Conditional Random Field (CRF) optimization package. Built on top of the Toolkit for Advanced Optimization (TAO) developed by Argonne National Laboratory, CRF-OPT offers an efficient, high-precision solver for large-scale CRF learning. Key features include a preprocessing algorithm that reduces the cost of function and gradient evaluations, and an automated exponential transformation that addresses a premature-termination problem commonly encountered when gradient-based search algorithms are applied to CRF optimization; the standard objective is sketched below. CRF-OPT provides a uniform interface for specifying the training sequences, testing sequences, and features. Please go through the README and EXAMPLES sections for details. (Examples  Readme)
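
For context, a CRF solver of this kind minimizes the negative conditional log-likelihood of a (typically linear-chain) CRF over the training sequences; in standard notation, with a generic Gaussian prior shown for illustration rather than CRF-OPT's exact objective:

\[
p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\!\Big( \sum_{t} \sum_{k} \lambda_k\, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Big),
\qquad
\mathcal{L}(\lambda) = -\sum_{i=1}^{N} \log p\big(\mathbf{y}^{(i)} \mid \mathbf{x}^{(i)}\big) + \frac{\lVert \lambda \rVert_2^2}{2\sigma^2}.
\]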