During Spring and Summer 2017, the neural networks seminar typically met on Tuesdays at 2:00 p.m. in Mudd 417. Each week we discussed one or two contemporary works on optimization, geometry, and symmetry in the training of neural networks and related learning problems.

This page lists, in reverse chronological order, the papers discussed in the seminar so far.

Past Discussion Topics

  • August 1, 2017
    • Ashish Bora, Ajil Jalal, Eric Price, and Alexandros G. Dimakis, Compressed Sensing using Generative Models. ICML 2017. link
  • July 25, 2017
    • Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, and Yi Zhang, Generalization and Equilibrium in Generative Adversarial Nets (GANs). ICML 2017. link
    • Sanjeev Arora and Yi Zhang, Do GANs Actually Learn the Distribution? An Empirical Study. Preprint, 2017. link
  • July 19, 2017
    • Mark Borgerding, Philip Schniter, and Sundeep Rangan, AMP-Inspired Deep Networks for Sparse Linear Inverse Problems. IEEE Trans. Sig. Proc. 65(16), 2017. link
  • July 11, 2017
    • Martin Arjovsky, Soumith Chintala, and Léon Bottou, Wasserstein GAN. Preprint, 2017. link
  • June 27, 2017
    • Sachin Ravi and Hugo Larochelle, Optimization as a Model for Few-Shot Learning. ICLR 2017. link
    • Marcin Andrychowicz, Misha Denil, Sergio Gómez Colmenarejo, Matthew Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, and Nando de Freitas, Learning to Learn by Gradient Descent by Gradient Descent. NIPS 2016. link
  • June 20, 2017
    • Gintare Karolina Dziugaite and Daniel Roy, Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data. Preprint, 2017. link
  • June 13, 2017
    • Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, and Nathan Srebro, Geometry of Optimization and Implicit Regularization in Deep Learning. Preprint, 2017. link
    • Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann LeCun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, and Riccardo Zecchina, Entropy-SGD: Biasing Gradient Descent Into Wide Valleys. ICLR 2017. link
  • June 6, 2017
    • Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, and Benjamin Recht, The Marginal Value of Adaptive Gradient Methods in Machine Learning. Preprint, 2017. link
    • John Duchi, Elad Hazan, and Yoram Singer, Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. JMLR (12), 2011. link
    • Diederik Kingma and Jimmy Ba, Adam: A Method for Stochastic Optimization. ICLR 2015, revised 2017. link
  • May 30, 2017
    • Shai Shalev-Shwartz, Ohad Shamir, and Shaked Shammah, Failures of Gradient-Based Deep Learning. Preprint, 2017. link
  • May 2, 2017
    • Moritz Hardt, Benjamin Recht, and Yoram Singer, Train Faster, Generalize Better: Stability of Stochastic Gradient Descent. ICML 2016. link
  • April 25, 2017
    • Ian Goodfellow, Oriol Vinyals, and Andrew Saxe, Qualitatively Characterizing Neural Network Optimization Problems. ICLR 2015. link
    • C. Daniel Freeman and Joan Bruna, Topology and Geometry of Half-Rectified Network Optimization. ICLR 2017. link
  • April 18, 2017
    • Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals, Understanding Deep Learning Requires Rethinking Generalization. ICLR 2017. link
  • April 14, 2017
    • Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, and Ping Tak Peter Tang, On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. ICLR 2017. link
    • Laurent Dinh, Razvan Pascanu, Samy Bengio, and Yoshua Bengio, Sharp Minima Can Generalize For Deep Nets. ICML 2017. link
  • April 7, 2017
    • Haohan Wang and Bhiksha Raj, On the Origin of Deep Learning. Preprint, 2017. link
    • Jürgen Schmidhuber, Deep Learning in Neural Networks: An Overview. Neural Networks (61), 2015. link