Learning to Correct Form and Function with Reinforcement
All biological functions of life are determined by molecular forms. These dynamical events are
sensed, directed and modulated by small and large molecules, often in clusters, reconfiguring to
assume optimized form for targeted function. In this talk I shall explain how we mathematically
interpret such lessons from nature, and adaptively train deep generative networks, to stably learn
to correct form for optimized function, in a number of dynamical scenarios.
We exploit the tight connections between discrete and continuous controllable dynamical systems and
reinforcement learning. These characterizations are crucial for deriving feasibility and guaranteed
convergence (stability), while accelerating the reinforcement learning process.
Chandrajit Bajaj is the director of the Center for Computational Visualization, at the Oden
Institute for Computational and Engineering Sciences and a Professor of Computer Sciences at the
University of Texas at Austin. Bajaj holds the Computational Applied Mathematics Chair in
Visualization. He is also an affiliate faculty member of Mathematics, Computational Neuroscience and
Electrical Engineering. He is currently on the editorial boards for the International Journal of
Computational Geometry and Applications, and the ACM Computing Surveys, and past editorial member of
the SIAM Journal on Imaging Sciences. He was awarded a distinguished alumnus award from the Indian
Institute of Technology, Delhi, (IIT, Delhi). He is also a Fellow of The American Association for
the Advancement of Science (AAAS), Fellow of the Association for Computing Machinery (ACM), Fellow
of the Institute of Electrical and Electronics Engineers (IEEE), and Fellow of the Society for
Industrial and Applied Mathematics (SIAM). He has won the University of Texas Faculty research
award, the Dean Research Assignment award, and has also thrice won the University of Texas,
Institute of Computational Engineering and Sciences, Moncrief Grand Challenge research award.
Uniformly distributed subsets of a metric space are important in a wide range of applied problems.
Recently there has been renewed attention to the problem of constructing finite subsets of a real
sphere that optimize the deviation from the uniform distribution. In this talk we address this range
of questions for finite metric spaces with a focus on the Hamming space. In the first part of the
talk we connect quadratic discrepancy of a subset with a distribution of distances in it, proving a
version of Stolarsky's invariant principle for finite metric spaces. Then we derive several bounds
on quadratic discrepancy for the Hamming space and identify codes that are discrepancy minimizers.
Finally, we derive bounds on Lp discrepancies of codes and mention some other extensions of the
Alexander Barg is a Professor in the Departments of Electrical Engineering, Computer Science, and
Mathematics of the University of Maryland, College Park, MD. His research interests are in
information theory and related areas of applied mathematics.
Despite impressive performance, most deep learning models used in AI applications still suffer from
a lack of robustness against adversarial attacks. I will describe a new framework for deep learning
called implicit deep learning, which generalizes the standard recursive rules of feedforward neural
networks. These models are based on the solution of a fixed-point equation involving a single
vector of hidden features, which is thus only implicitly defined. The new framework greatly
simplifies the notation of deep learning, and opens up many new possibilities, in terms of novel
architectures and algorithms, and robustness analysis and design.
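As a rough illustration of the fixed-point idea, the sketch below assumes one common form of implicit model, x = relu(A x + B u) with output y = C x + D u. The matrix names A, B, C, D, the ReLU activation, the toy numbers, and the plain fixed-point iteration are all assumptions made for this example and are not taken from the talk.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def relu(v):
    return [max(t, 0.0) for t in v]

def implicit_forward(A, B, C, D, u, tol=1e-12, max_iter=1000):
    """Solve x = relu(A x + B u) by fixed-point iteration, then return (x, y)
    with y = C x + D u. Convergence is assumed to hold because the rows of A
    have absolute sums below 1 (an assumption of this sketch)."""
    Bu = matvec(B, u)
    x = [0.0] * len(A)
    for _ in range(max_iter):
        x_next = relu([a + b for a, b in zip(matvec(A, x), Bu)])
        done = max(abs(p - q) for p, q in zip(x_next, x)) < tol
        x = x_next
        if done:
            break
    y = [c + d for c, d in zip(matvec(C, x), matvec(D, u))]
    return x, y

# Hypothetical toy parameters: 2 hidden features, 2 inputs, 1 output.
A = [[0.2, -0.1], [0.3, 0.1]]
B = [[1.0, 0.5], [-0.5, 1.0]]
C = [[1.0, -1.0]]
D = [[0.0, 0.5]]
u = [1.0, 2.0]

x, y = implicit_forward(A, B, C, D, u)
```

For these toy numbers the hidden vector settles near x = [2.2, 2.4] and the output near y = [0.8]; the hidden features are never computed layer by layer, only as the solution of the equation.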
Laurent El Ghaoui graduated from Ecole Polytechnique (Palaiseau, France) in 1985, and obtained his PhD in
Aeronautics and Astronautics at Stanford University in March 1990. He was a faculty member of the
Ecole Nationale Supérieure de Techniques Avancées (Paris, France) from 1992 until 1999, and held
part-time teaching appointments at Ecole Polytechnique within the Applied Mathematics department and
Université de Paris-I (La Sorbonne) in the Mathematics in Economy program. He joined the Berkeley
faculty in April 1999 as an Acting Associate Professor, and obtained his tenure in May 2001. He was
on leave from UC from July 2003 until September 2006 to work for SAC Capital Management, a hedge
fund based in New York and Connecticut. Since then he has been back full time at UC Berkeley in the
EECS department. He teaches optimization in that department (EE 127 / EE 227AT and EE 227BT), and a
class on Optimization Models within the Masters of Financial Engineering at the Haas School of
In 2017 he co-founded a company, SumUp Analytics, which provides high-speed streaming text
analytics for business applications.
He was the recipient of a Bronze Medal for Engineering Sciences, from the Centre National de la
Recherche Scientifique (France), a CAREER award, an Okawa research grant, and a Google research
grant. He is also the co-recipient of a SIAM optimization prize.
Beyond Trans-dimensional Sampling: Generalised Bayesian Model Selection
Most model estimation methods of signal processing and machine learning make the assumption that the
model type and dimension is known beforehand and the estimation problem reduces to estimating the
values of the parameters of the known model of known dimension. This is not a realistic assumption
in various applications though. We generally do not know in advance how many clusters can be in our
data, or how many sources there are in a speech signal mixture in the cocktail party problem. We
would not know how many targets there are in a radar signal. The Bayesian Monte Carlo method, namely
Reversible Jump Markov Chain Monte Carlo (RJMCMC) provides a solution to the estimation of model
dimension. However, a more general problem of estimating the type of model remains. For example, in
a channel estimation problem we generally do not know the distribution of noise, or in a system
identification problem we do not know beforehand if the system is linear or nonlinear. In this talk,
we will present a broader picture of the model selection problem and will extend the RJMCMC algorithm to
a trans-class sampling algorithm which is capable of choosing among different generic models
automatically. We will demonstrate the success of the method on problems such as Volterra system
identification, PLC noise estimation, speckle classification in Synthetic Aperture Radar Images and
in wavelet domain modelling of natural images.
Ercan E. Kuruoğlu received MPhil and PhD degrees in information engineering from the University of
Cambridge, Cambridge, United Kingdom, in 1995 and 1998, respectively. In 1998, he joined Xerox
Research Center Europe, Cambridge. He was an ERCIM fellow in 2000 with INRIA-Sophia Antipolis,
France. In January 2002, he joined ISTI-CNR, Pisa, Italy. He was a visiting professor with Georgia
Tech-China in 2007, 2011 and 2016, Southern University of Science and Technology of China, Shenzhen
in 2017 and Zhejiang Gongshang University in 2019. He is currently a Visiting Professor at
Tsinghua-Berkeley Shenzhen Institute, on leave from his Senior Researcher position at Institute of
Science and Technology of Information-CNR (Italian National Council of Research). He was an
associate editor for the IEEE Transactions on Signal Processing and IEEE Transactions on Image
Processing. He is currently the editor in chief of Digital Signal Processing: A Review Journal. He
acted as a technical co-chair for EUSIPCO 2006 and a tutorials co-chair of ICASSP 2014. He is a
member of the IEEE Technical Committees on Signal Processing Theory and Methods, Machine Learning
for Signal Processing and Image, Vision and Multidimensional Signal Processing. He was a plenary
speaker at DAC 2007, ISSPA 2010, IEEE SIU 2017, Entropy 2018, TBSI-WODS 2019 and tutorial speaker at
IEEE ICSPCC 2012. He was a Chinese Government 111 Project Foreign Expert 2007-2011. He was an
Alexander von Humboldt Experienced Research Fellow in the Max Planck Institute for Molecular
Genetics in 2013-2015. His research interests are in the areas of statistical signal and image
processing and information and coding theory with applications in computational biology, remote
sensing, telecommunications, earth sciences and astrophysics.
It has been demonstrated recently that deep learning (DL) has great potential to break the
bottleneck of the conventional communication systems. In this talk, we present our recent work in DL
in wireless communications, including physical layer processing and resource allocation. DL can
improve the performance of each individual (traditional) block in a conventional communication
system or jointly optimize the whole transmitter or receiver. Therefore, we can categorize the
applications of DL in physical layer communications into with and without block processing
structures. For DL based communication systems with block structures, we present joint channel
estimation and signal detection based on a fully connected deep neural network, and model-driven DL for
signal detection. For those without block structures, we provide our recent endeavors in developing
end-to-end learning communication systems with the help of deep reinforcement learning (DRL) and
generative adversarial net (GAN).
Judicious resource (spectrum, power, etc.) allocation can significantly improve efficiency of
wireless networks. The traditional wisdom is to explicitly formulate resource allocation as an
optimization problem and then exploit mathematical programming to solve it to a certain level of
optimality. Deep learning represents a promising alternative due to its remarkable power to leverage
data for problem solving and can help solve optimization problems for resource allocation or can be
directly used for resource allocation. We will first present our research results in using deep
learning to reduce the complexity of mixed integer non-linear programming (MINLP). We will then
discuss how to use deep reinforcement learning directly for wireless resource allocation with
application in vehicular networks.
Dr. Geoffrey Li is currently a Professor with the School of Electrical and Computer Engineering at
Georgia Institute of Technology. He was with AT&T Labs – Research for five years before joining
Georgia Tech in 2000. His general research interests include statistical signal processing and
machine learning for wireless communications. In these areas, he has published over 500 refereed
journal and conference papers in addition to over 40 granted patents. His publications have been
cited over 41,000 times and he has been listed as the World’s Most Influential Scientific Mind, also
known as a Highly-Cited Researcher, by Thomson Reuters almost every year since 2001. He has been an
IEEE Fellow since 2006. He received 2010 IEEE ComSoc Stephen O. Rice Prize Paper Award, 2013 IEEE
VTS James Evans Avant Garde Award, 2014 IEEE VTS Jack Neubauer Memorial Award, 2017 IEEE ComSoc
Award for Advances in Communication, 2017 IEEE SPS Donald G. Fink Overview Paper Award, and 2019
IEEE ComSoc Edwin Howard Armstrong Achievement Award. He also won the 2015 Distinguished Faculty
Achievement Award from the School of Electrical and Computer Engineering, Georgia Tech.
Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate
To learn intrinsic low-dimensional structures from high-dimensional data that most discriminate
between classes, we propose the principle of Maximal Coding Rate Reduction (MCR^2), an
information-theoretic measure that maximizes the coding rate difference between the whole dataset
and the sum of each individual class. We clarify its relationships with most existing frameworks
such as cross-entropy, information bottleneck, information gain, contractive and contrastive
learning, and provide theoretical guarantees for learning diverse and discriminative features. The
coding rate can be accurately computed from finite samples of degenerate subspace-like distributions
and can learn intrinsic representations in supervised, self-supervised, and unsupervised settings in
a unified manner. Empirically, the representations learned using this principle alone are
significantly more robust to label corruptions in classification than those using cross-entropy, and
can lead to state-of-the-art results in clustering mixed data from self-learned invariant features
Yi Ma is currently a professor in residence at the EECS Department of UC Berkeley. He was a
professor and the executive dean of the School of Information Science and Technology,
ShanghaiTech University, China from 2014 to 2017. From 2009 to early 2014, he was a Principal
Researcher and the Research Manager of the Visual Computing group at Microsoft Research in Beijing.
From 2000 to 2011, he was an assistant and associate professor at the Electrical & Computer
Engineering Department of the University of Illinois at Urbana-Champaign. His main research interest
is in computer vision, data science, and systems theory. Yi Ma received his Bachelor's degree in
Automation and Applied Mathematics from Tsinghua University (Beijing, China) in 1995, a Master of
Science degree in EECS in 1997, a Master of Arts degree in Mathematics in 2000, and a PhD degree in
EECS in 2000, all from the University of California at Berkeley. Yi Ma received the David Marr Best
Paper Prize at the International Conference on Computer Vision 1999, the Longuet-Higgins Best Paper
Prize (honorable mention) at the European Conference on Computer Vision 2004, and the Sang Uk Lee
Best Student Paper Award with his students at the Asian Conference on Computer Vision in 2009. He
also received the CAREER Award from the National Science Foundation in 2004 and the Young
Investigator Award from the Office of Naval Research in 2005. He has written two textbooks: “An
Invitation to 3-D Vision” published in 2004, and “Generalized Principal Component Analysis”
published in 2016, both by Springer. He was an associate editor of IEEE Transactions on Pattern
Analysis and Machine Intelligence (TPAMI), the International Journal of Computer Vision (IJCV), SIAM
Journal on Imaging Sciences, IEEE Signal Processing Magazine, and IEEE Transactions on Information
Theory (TIT). He is currently a founding associate editor of the IMA Journal on Information and
Inference and the SIAM Journal on Mathematics of Data Science. He has served as an Area Chair for ICCV,
CVPR, and NIPS, the Program Chair for ICCV 2013, and the General Chair for ICCV 2015. He is a Fellow
of both IEEE and ACM. He was named one of the World's Highly Cited Researchers of 2016 by Clarivate
Analytics of Thomson Reuters and is among the Top 50 Most Influential Authors in Computer Science
in the World, as ranked by Semantic Scholar and reported by Science Magazine in April 2016.
Dynamic Latent Variable Analytics for Anomaly Detection and Monitoring
In this talk we provide a new perspective on process data analytics towards Industry 4.0 based on
latent variable modeling. We are concerned with data science and analytics as applied to data from
dynamic systems and processes for the purpose of monitoring, prediction, and inference. Collinearity
is inevitable in operational data. Therefore, we focus on latent variable methods that achieve
dimension reduction and collinearity removal. We present a new dimension reduction expression of
state space framework for dynamic latent variable analytics, including dynamic-inner principal
component analysis and dynamic-inner canonical correlation analysis. They are introduced to model
high dimensional time series data to extract the most dynamic latent features. We show with an
industrial case how real process data are efficiently modeled using these analytics to extract
dynamic features, illustrating the point that dynamic feature extraction from process data is
indispensable for process troubleshooting, visualization, diagnosis, and improvement.
Dr. S. Joe Qin obtained his B.S. and M.S. degrees in Automatic Control from Tsinghua University in
Beijing, China, in 1984 and 1987, respectively, and his Ph.D. degree in Chemical Engineering from
University of Maryland at College Park in 1992. He is currently Chair Professor, Dean of the School
of Data Science, and Director of Hong Kong Institute for Data Science at City University of Hong
Kong. In his prior career he worked as the Fluor Professor at the Viterbi School of Engineering of
the University of Southern California, Professor at the University of Texas at Austin, and Principal
Engineer at Emerson Process Management for 28 years cumulatively.
Dr. Qin is a Fellow of the International Federation of Automatic Control (IFAC), AIChE, and IEEE. He
is a recipient of the U.S. National Science Foundation CAREER Award, the 2011 Northrop Grumman Best
Teaching award at Viterbi School of Engineering, the DuPont Young Professor Award, Halliburton/Brown
& Root Young Faculty Excellence Award, NSF-China Outstanding Young Investigator Award, and recipient
of the IFAC Best Paper Prize for a model predictive control paper published in Control Engineering
Practice. He has served as Senior Editor of Journal of Process Control, Editor of Control
Engineering Practice, Member of the Editorial Board for Journal of Chemometrics, and Associate
Editor for several journals. He has published over 400 international journal papers, book chapters,
conference papers, and presentations. He received over 14,000 Web of Science citations with an
h-index of 57, over 18,000 Scopus citations with an h-index of 62, and 30,000 Google Scholar
citations with an h-index of 74. Dr. Qin’s research interests include data analytics, machine
learning, process monitoring, fault diagnosis, model predictive control, system identification,
smart manufacturing, and predictive maintenance.
Decomposition Multiobjective Optimization and Pareto Multitask Learning
Many real-world optimization problems are multiobjective by nature. Multiobjective evolutionary
algorithms are a widely used algorithmic framework for solving multiobjective optimization problems.
In this talk, I will briefly explain the basic ideas behind decomposition based multiobjective
evolutionary algorithm (MOEA/D). Multitask learning can be naturally modelled as a multiobjective
optimization problem. I will introduce a recent application of MOEA/D on multitask learning.
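To make the decomposition idea concrete, the sketch below scalarizes a two-objective problem with the Tchebycheff function, one commonly used choice in decomposition-based methods. The abstract does not say which scalarization the talk uses, so this choice, the candidate objective vectors, the weight vectors, and the ideal point are all assumptions for illustration; the evolutionary search and neighborhood updates of MOEA/D are not shown.

```python
def tchebycheff(f, w, z):
    """Tchebycheff scalarization: max_i w_i * |f_i - z_i| for objective
    vector f, weight vector w, and ideal (reference) point z."""
    return max(wi * abs(fi - zi) for fi, wi, zi in zip(f, w, z))

# Hypothetical candidates given as objective vectors (f1, f2), both minimized.
candidates = [(0.0, 1.0), (0.25, 0.5625), (0.5, 0.25), (0.75, 0.0625), (1.0, 0.0)]
ideal = (0.0, 0.0)

# Each weight vector defines one scalar subproblem.
weights = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]

# Best candidate for each subproblem under its own scalarization.
best = [min(candidates, key=lambda f: tchebycheff(f, w, ideal)) for w in weights]
```

Each weight vector picks out a different trade-off among the candidates, which is how a set of scalar subproblems can jointly approximate a Pareto front.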
Qingfu Zhang is a Chair Professor of Computational Intelligence at the Department of Computer
Science, City University of Hong Kong. His main research interests include evolutionary computation,
optimization, neural networks, data analysis, and their applications. His decomposition based
multiobjective evolutionary algorithm (MOEA/D) has been one of the two most widely used algorithms in the
field of evolutionary computation. He is a Web of Science highly cited researcher in Computer
Science for four consecutive years from 2016. He is an IEEE fellow.
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov
decision process (MDP), based on a single trajectory of Markovian samples generated by following a
behavior policy. Focusing on a $\gamma$-discounted MDP with state space S and action space A, we
demonstrate that the sample complexity of classical asynchronous Q-learning --- namely, the number
of samples needed to yield an $\varepsilon$-accurate Q-function estimate in an entrywise sense --- is
at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$
up to some logarithmic factor, provided that a proper constant learning
rate is adopted. Here, $t_{\mathrm{mix}}$ and $\mu_{\min}$ denote respectively the mixing time
and the minimum state-action occupancy probability of the sample trajectory. This sample complexity
bound improves upon the state-of-the-art result by a factor of at least |S||A|. Our result confirms
that if the mixing time is not too large, then the convergence of asynchronous Q-learning resembles
the synchronous case with independent samples. Further, the scaling on the discount complexity can
be improved by means of variance reduction.
This is joint work with Gen Li, Yuting Wei, Yuantao Gu, and Yuejie Chi.
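For readers unfamiliar with the setting, the following is a minimal sketch of classical asynchronous Q-learning with a constant learning rate, run on a single trajectory. The toy two-state MDP, the uniformly random behavior policy, and the parameter values are hypothetical choices for this example; it does not implement the variance-reduced variant mentioned in the abstract.

```python
import random

def async_q_learning(step, n_states, n_actions, gamma, eta, n_samples, seed=0):
    """Asynchronous Q-learning along one trajectory: at each time step only
    the visited (state, action) entry of Q is updated."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(n_samples):
        a = rng.randrange(n_actions)        # behavior policy: uniform (assumption)
        r, s_next = step(s, a)              # one Markovian sample
        target = r + gamma * max(Q[s_next])
        Q[s][a] = (1 - eta) * Q[s][a] + eta * target  # constant learning rate eta
        s = s_next
    return Q

def toy_step(s, a):
    """Hypothetical deterministic MDP: action a moves to state a and pays
    reward 1 when a == 1, else 0."""
    return (1.0 if a == 1 else 0.0), a

Q = async_q_learning(toy_step, n_states=2, n_actions=2,
                     gamma=0.9, eta=0.5, n_samples=5000)
```

In this toy problem the optimal values are Q*(s, 1) = 10 and Q*(s, 0) = 9 for both states, since always taking action 1 earns 1 / (1 - 0.9) = 10. Because the toy MDP is deterministic there is no sampling noise, so the estimate converges to these values; in a stochastic MDP a constant learning rate would leave a residual error that depends on its size.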
Yuxin Chen is currently an assistant professor in the Department of Electrical Engineering at
Princeton University. Prior to joining Princeton, he was a postdoctoral scholar in the Department of
Statistics at Stanford University, and he completed his Ph.D. in Electrical Engineering at Stanford
University. His research interests include high-dimensional statistics, convex and nonconvex
optimization, statistical learning, information theory, and reinforcement learning. He received the
2019 AFOSR Young Investigator Award, the 2020 ARO Young Investigator Award, and the 2020 Princeton
Graduate Mentoring Award. He has also been selected as a finalist for the 2019 Best Paper Prize for
Young Researchers in Continuous Optimization.
Crowdsourced Classification with XOR Queries: Fundamental Limits and an Efficient Algorithm
Crowdsourcing systems have emerged as an effective platform to label data and classify objects with
relatively low cost by exploiting non-expert workers.
To ensure reliable recovery of unknown labels with as few queries as possible, we consider
an effective query type that asks for a "group attribute" of a chosen subset of objects. In particular,
we consider the problem of classifying $m$ binary labels with XOR queries that ask whether the
number of objects having a given attribute in the chosen subset of size $d$ is even or odd. The
subset size $d$, which we call the query degree, can vary across queries. Since a worker needs to
make more effort to answer a query of a higher degree, we consider a noise model where the accuracy
of a worker's answer depends on both the worker's reliability and the query degree $d$. For this
general model, we characterize the information-theoretic limit on the optimal number of queries to
reliably recover $m$ labels in terms of a given combination of degree-$d$ queries and noise
parameters. Further, we propose an efficient inference algorithm that achieves this limit even when
the noise parameters are unknown.
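The query model can be sketched in a few lines. The function below returns the parity of the labels in a chosen subset and flips the answer with some probability; the name `xor_query`, the single `flip_prob` parameter, and the example labels are hypothetical, and the talk's actual noise model (which depends on worker reliability and query degree) and inference algorithm are not reproduced here.

```python
import random

def xor_query(labels, subset, flip_prob=0.0, rng=random):
    """Answer an XOR query: 1 if an odd number of objects in `subset` have
    label 1, else 0. The answer is flipped with probability `flip_prob`,
    a stand-in for worker noise (assumption of this sketch)."""
    parity = sum(labels[i] for i in subset) % 2
    if rng.random() < flip_prob:
        parity ^= 1
    return parity

labels = [1, 0, 1, 1, 0]                      # hypothetical ground-truth labels, m = 5

odd_answer = xor_query(labels, [0, 2, 3])     # degree d = 3, three 1-labels
even_answer = xor_query(labels, [0, 2])       # degree d = 2, two 1-labels
flipped = xor_query(labels, [0, 2], flip_prob=1.0)  # always-wrong worker
```

A degree-1 query reduces to asking for a single label directly, while higher-degree queries combine several labels into one answer at the cost of, in the abstract's model, lower accuracy.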
Hye Won Chung is an assistant professor in the School of Electrical Engineering at KAIST. Her
research interests include statistical inference, information theory, data science, machine
learning, and quantum information. She received the B.S. degree (summa cum laude) from KAIST in
Korea and the M.S. and Ph.D. degrees from MIT, all in Electrical Engineering and Computer Science,
in 2007, 2009 and 2014, respectively. From 2014 to 2017, she worked as a Research Fellow in the
Department of Electrical Engineering and Computer Science at the University of Michigan.
Industrial Internet has been designated as a new network infrastructure in China. Many top industrial
manufacturing solution providers are now competing for the industrial internet.
In this talk, I will give a comprehensive overview of industrial internet including its concept,
scope, core technologies, the current status and the developing trend. The typical scenarios
motivated by industrial internet solutions will be detailed finally.
Weixi Gu is the leader of the industrial intelligence research team in China Academy of Industrial
Internet. His research interests include industrial internet of things, machine learning, mobile
Previously, Weixi Gu was a postdoctoral fellow at UC Berkeley. He received his Ph.D. and Master's
degrees from Tsinghua University in 2018 and 2015, respectively, and his Bachelor's degree from
Shanghai Jiaotong University in 2012.
Weixi Gu has received several awards, including the Best Paper Award at Mobiquitous 2016, the Best
Paper Award at Trustcom 2014, and the National Scholarship in 2014, 2016, and 2017.
Nowadays, 3D point clouds can be easily acquired and naturally used to express any objects or scenes
in the real world with various scales and rich attributes. We could thus calculate and restore the
entire world from such points, so one may say "everything is a 3D point cloud or point clouds are
everywhere:-)" In this talk, I will review a series of our research work on points learning
over the past decade, highlighting in depth the acquisition, consolidation, representation and
reconstruction techniques, and wrap up with some thoughts on potential explorations in the near
Hui HUANG is a Distinguished Professor of Shenzhen University, where she directs the Visual
Computing Research Center. She received her PhD degree in Applied Math from the University of
British Columbia in 2008 and another PhD degree in Computational Math from Wuhan University in 2006.
Her research interests span Computer Graphics and 3D Vision, focusing on Geometry Modeling, Shape
Analysis, Points Learning, Image Processing, 3D/4D Acquisition and Creation. She is currently an
Associate Editor-in-Chief of The Visual Computer and is on the editorial board of ACM Transactions
on Graphics and Computers & Graphics. She has served on the program committees of all major computer
graphics conferences, including SIGGRAPH, SIGGRAPH ASIA, EG, SGP, PG, 3DV, CGI, GMP, SMI, GI and
CAD/Graphics, etc. She is invited to be SMI 2020 Conference Chair, CAD&CG 2020 Program Chair, SGP
2019 Program Chair, CHINAGRAPH 2018 Program Vice-Chair, in addition to SIGGRAPH ASIA 2017 Technical
Briefs and Posters Co-Chair, SIGGRAPH ASIA 2016 Workshops Chair and SIGGRAPH ASIA 2014 Community
Liaison Chair. She is the recipient of National Excellent Young Scientist Fund and GD Outstanding
Talent Award. She is also selected as CCF Distinguished Member and ACM/IEEE/CGIS Senior Member.
Efficient Computing Platform Design for Autonomous Driving Systems
Massive sensor data can be generated on an autonomous vehicle and have to be processed in time using
complex models, including deep neural networks, which require a computing platform with high
computing ability. However, hardware resources are often restricted due to cost and power
consumption concerns. In this talk, I will introduce how we improve the computing efficiency on
heterogeneous chips with customized design, and how we design the network architectures with
hardware-aware optimizations. I will also describe how Novauto implements its computing platforms to
support autonomous driving systems for L4 level scenarios.
Shuang Liang is currently the CTO of Novauto, China. He received his Ph.D. and B.S. degree from the
Institute of Microelectronics, Tsinghua University, Beijing, China, in 2018 and 2011, respectively.
He was a visiting scholar at the Department of Computing, Imperial College London, UK, in 2016. His
interests include reconfigurable computing, hardware acceleration of machine learning algorithms and
Principal Researcher of AI Department
Challenges in Real-world Applications of Federated Learning: Efficiency and Security
In this talk, I will discuss some of the key concepts and challenges in an emerging field called
federated learning, with a focus on real-world industrial applications.
Dr. Yang Liu is a Principal Researcher in the AI Department of WeBank, China. Her research interests
include machine learning, federated learning, transfer learning, multi-agent systems, statistical
mechanics, and applications of these technologies in the financial industry. She received her PhD
from Princeton University in 2012 and her Bachelor's degree from Tsinghua University in 2007. She
holds multiple patents. Her research has been published in leading scientific conferences and
journals such as IJCAI, ACM TIST and Nature. She co-authored the book "Federated Learning" - the
first monograph on the topic of federated learning. Her research work has been recognized with
multiple awards, such as CCF Technology Award, AAAI Innovation Award and IJCAI Innovation Award.
University of Massachusetts Amherst
Learning Mixtures and Trace Reconstruction
We present a complex-analytic method of learning mixtures of distributions and apply it to learn
Gaussian mixtures with shared variance, binomial mixtures with shared success probability, and
Poisson mixtures, among others. The complex analytic method was introduced to reconstruct a sequence
from its random subsequences, which is called the trace reconstruction problem. We show some new
results in trace reconstruction and mention some potential extension of the complex analytic method
in learning mixtures.
Arya Mazumdar is an Associate Professor of Computer Science with additional affiliations to the
Department of Mathematics and Statistics and the Center for Data Science at the University of
Massachusetts Amherst (UMass). He is also a researcher in Amazon AI and Search. In the past, he was
a faculty member at University of Minnesota-Twin Cities (2013--15), and a postdoctoral scholar at
Massachusetts Institute of Technology (2011--12). Arya received his Ph.D. from the University of
Maryland College Park, where his thesis won a Distinguished Dissertation Award (2011). Arya is a
recipient of multiple other awards, including an NSF CAREER award (2015), an EURASIP Best Paper
Award (2020) and an ISIT Jack K. Wolf Paper Award (2010). He is currently serving as an Associate
Editor for the IEEE Transactions on Information Theory. Arya's research interests include coding
theory (error-correcting codes and related combinatorics), information theory and foundations of
Learning Low-complexity Models from the Data – Geometry, Optimization, and Applications
Today we are collecting massive amounts of data in the form of images and videos, and we want to
learn from the data themselves to extract useful information and to make predictions. The data are
high-dimensional, but often possess certain low-dimensional structures (e.g., sparsity). However,
learning these low-complexity models often results in highly nonconvex optimization problems, for
which our understanding of how to solve them has been very limited. In the worst case, optimizing a
nonconvex problem is NP-hard.
In this talk, we present global nonconvex optimization theory and guaranteed algorithms for
efficient learning of low-complexity models from high-dimensional data. For several important
problems in imaging science (i.e., sparse blind deconvolution) and representation learning (i.e.,
convolutional/overcomplete dictionary learning), we show that the underlying symmetry and
low-complexity structures avoid the worst-case scenarios, leading to benign global geometric
properties of the nonconvex optimization landscapes. In particular, for sparse blind deconvolution
that aims to jointly learn the underlying physical model and sparse signals from convolutions, the
geometric intuitions lead to efficient nonconvex algorithms, with linear convergence to target
solutions. Moreover, we extended our geometric analysis to convolutional dictionary learning based
on its similarity with overcomplete dictionary learning, providing the first global algorithmic
guarantees for both problems. Finally, we demonstrate our methods on several important applications
in scientific discovery, and draw connections to learning deep neural networks.
This talk is mainly based on one paper that appeared in NeurIPS’19 (spotlight), and two papers accepted
by ICLR’20 (one oral).
Qing Qu is a Moore-Sloan data science fellow at the Center for Data Science, New York University. He
received his Ph.D. from Columbia University in Electrical Engineering in Oct. 2018. He received his
B.Eng. from Tsinghua University in Jul. 2011, and an M.Sc. from the Johns Hopkins University in Dec.
2012, both in Electrical and Computer Engineering. He interned at U.S. Army Research Laboratory in
2012 and Microsoft Research in 2016, respectively. His research interest lies at the intersection of
the foundation of data science, machine learning, numerical optimization, and signal/image
processing. His research focuses on developing computational methods for learning low-complexity
models/structures from high dimensional data, leveraging tools from machine learning, numerical
optimization, and high dimensional probability/geometry. He is also interested in applying these
data-driven methods to various engineering problems in imaging sciences, scientific discovery, and
healthcare. He is the recipient of the Best Student Paper Award at SPARS’15 (with Ju Sun and John
Wright) and of the Microsoft Ph.D. Fellowship 2016-2018 in machine learning.
Combinatorial list-decoding of Reed-Solomon codes beyond the Johnson radius
List-decoding of Reed-Solomon (RS) codes beyond the so-called Johnson radius has been one of the
main open questions since the work of Guruswami and Sudan. It is now known by the work of Rudra and
Wootters, using techniques from high dimensional probability, that over large enough alphabets most
RS codes are indeed list-decodable beyond this radius.
In this talk, we take a more combinatorial approach which allows us to determine the precise
relation (up to the exact constant) between the decoding radius and the list size. We prove a
generalized Singleton bound for a given list size, and conjecture that the bound is tight for most
RS codes over large enough finite fields. We also show that the conjecture holds for list sizes 2
and 3, and as a by-product show that most RS codes with a rate of at least 1/9 are list-decodable
beyond the Johnson radius. Lastly, we give the first explicit construction of such RS codes. The
main tools used in the proof are a new type of linear dependency between codewords of a code that
are contained in a small Hamming ball, and the notion of cycle space from graph theory. Neither
has been used before in the context of list-decoding.
Joint work with Dr. Chong Shangguan.
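The rate threshold of 1/9 mentioned above can be checked numerically. The sketch below assumes two
closed forms that are not spelled out in the abstract: that the (relative) Johnson radius of a
rate-R RS code is 1 - sqrt(R), and that the generalized Singleton bound limits the relative decoding
radius for list size L to L/(L+1) * (1 - R). The function names are illustrative only.

```python
import math


def johnson_radius(rate):
    """Relative Johnson radius of a rate-`rate` RS code (assumed form: 1 - sqrt(R))."""
    return 1.0 - math.sqrt(rate)


def singleton_radius(rate, list_size):
    """Assumed generalized Singleton-type radius for list size L: L/(L+1) * (1 - R)."""
    return list_size / (list_size + 1) * (1.0 - rate)


if __name__ == "__main__":
    # Compare the two radii for list size 3 at a few rates around 1/9.
    for rate in (0.05, 1 / 9, 0.25, 0.5):
        j = johnson_radius(rate)
        s = singleton_radius(rate, 3)
        print(f"rate={rate:.4f}  johnson={j:.4f}  singleton(L=3)={s:.4f}  beyond={s > j}")
```

Under these assumptions the two radii coincide at rate 1/9 for list size 3, and the Singleton-type
radius is the larger one at higher rates.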
Itzhak Tamo is an Assistant Professor in the Department of Electrical Engineering-Systems at
Tel Aviv University, Israel.
He received his PhD in 2013 from Ben-Gurion University. Before joining Tel Aviv University,
he was a post-doc at the University of Maryland in the ECE/ISR department. His research interests
include information networks, coding theory, combinatorics and information theory in biology. He
has numerous publications at the IEEE International Symposium On Information Theory Conferences
and in the IEEE Transactions on Information Theory journal. He has been awarded the Krill Prize for
Excellence in Research in 2018, the 2015 IEEE Information Theory Society Paper Award and the 2013
IEEE Communication Society Data Storage Technical Committee Best Paper Award.
To Prove or to Disprove: Automated Reasoning by Convex Optimization
Hilbert's 24th problem was about a criterion for the simplicity of mathematical proofs, but there
can be a variety of plausible criteria for proof simplicity, each leading to a different way to
search for proofs. The right kind of proof simplicity criterion may even enable computers to prove
theorems in the field of artificial intelligence. The premise of this talk is automated reasoning by
convex optimization in which optimization-theoretic tools, when viewed in the context of interactive
theorem proving, can automate the task of generating insights and reasoning by computers.
Optimization-theoretic notions such as duality and recent advances in convex relaxation and
regularization methods for sparsity constraints can be exploited to automate proof search in
large-scale problems, pushing the limits of knowledge-discovery via mathematical optimization and
hardware acceleration. We discuss its application to proving or disproving linear inequalities in
information theory via a cloud-based AITIP Software-as-a-Service, and present some open issues in
Chee Wei Tan received the B.S. (First Class Honors) from the National University of Singapore and
the M.A. and Ph.D. degrees from Princeton University, Princeton, NJ. He is an Associate Professor of
Computer Science. His research interests are artificial intelligence, networks and graph analytics,
convex optimization and AI technologies for learning at scale.
He was a Visiting Faculty at Tencent AI Lab and a Senior Fellow of the Institute for Pure & Applied
Mathematics (IPAM) for the program on Science at Extreme Scales. He was a Postdoctoral Scholar at
the California Institute of Technology, and affiliated with the Netlab Group at Caltech.
He is currently serving as Editor for the IEEE/ACM Transactions on Networking. He previously served
as Editor for the IEEE Transactions on Communications (2012-2018) and as Chair of the IEEE
Information Theory Society Hong Kong Chapter when it received the 2015 IT Society Chapter of the
Year Award. Dr.
Tan was the recipient of the Princeton University Wu Prize for Excellence, Google Faculty Award,
IEEE ComSoc Researcher Award, and was twice selected for the U.S. National Academy of Engineering
China-America Frontiers of Engineering Symposium. He is Director of Education Bureau's Programme on
Creative Mathematics and Computer Science for Gifted Students and the Computer Science Challenge
(eSports game tournament) for high school students to learn computer science and advanced
Recent Advances in Black-box Adversarial Attacks to Deep Learning
In this talk, I will introduce recent advances in black-box adversarial attacks to deep learning
models. Since a black-box adversarial attack requires only the model’s output, rather than the
model parameters, it could pose a substantial threat to deep learning systems in real-world
scenarios. We will first give a general review of the literature on black-box adversarial attacks.
Then, we will introduce two of our recent works. One is a decision-based black-box attack,
which uses historical queries to accelerate the search and is the first to successfully fool a
face recognition API in the decision-based setting. The other
is a score-based black-box attack, which captures the probability distribution of
adversarial perturbations with a conditional Glow model, so that adversarial perturbations can be
sampled efficiently with only a few queries. Finally, I will share some
thoughts about the trends of this topic.
Dr. Baoyuan Wu is currently a Principal Researcher at Tencent AI Lab. He was a Postdoc in the IVUL
lab at KAUST, working with Prof. Bernard Ghanem, from August 2014 to November 2016. He received his
PhD degree from the National Laboratory of Pattern Recognition, Chinese Academy of Sciences (CASIA)
in 2014, supervised by Prof. Baogang Hu. His research interests are machine learning and computer
vision, including probabilistic graphical models, adversarial examples, multi-label learning and
integer programming. He has published 30+ papers in top-tier conferences and journals, including
TPAMI, IJCV, CVPR, ICCV, ECCV, and AAAI.
Carnegie Mellon University
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
We investigate the sample efficiency of reinforcement learning in a $\gamma$-discounted
infinite-horizon Markov decision process (MDP) with state space $\mathcal{S}$ and action space
$\mathcal{A}$, assuming access to a generative model. Despite a number of prior works tackling this
problem, a complete picture of the trade-offs between sample complexity and statistical accuracy is
yet to be determined. In particular, prior results suffer from a sample size barrier, in the sense
that their claimed statistical guarantees hold only when the sample size exceeds at least
$|\mathcal{S}||\mathcal{A}|/(1-\gamma)^2$ (up to some log factor).
The current paper overcomes this barrier by certifying the minimax optimality of model-based
reinforcement learning as soon as the sample size exceeds the order of
$|\mathcal{S}||\mathcal{A}|/(1-\gamma)$ (modulo some log factor). More specifically, a
perturbed model-based planning algorithm provably finds an $\varepsilon$-optimal policy with an
order of $\frac{|\mathcal{S}||\mathcal{A}|}{(1-\gamma)^3\varepsilon^2}
\log\frac{|\mathcal{S}||\mathcal{A}|}{(1-\gamma)\varepsilon}$ samples for any
$\varepsilon \in (0, \frac{1}{1-\gamma}]$. Along the way, we derive improved
(instance-dependent) guarantees for model-based policy evaluation. To the best of our knowledge,
this work provides the first minimax-optimal guarantee in a generative model that accommodates the
entire range of sample sizes (beyond which finding a meaningful policy is information
theoretically impossible).
This talk is based on joint work with Gen Li, Yuejie Chi, Yuantao Gu, and Yuxin Chen.
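As a rough illustration of what model-based planning with a generative model looks like, the sketch
below queries a sampler a fixed number of times per state-action pair, forms the empirical
transition kernel, and plans in the empirical MDP by value iteration. It is the plain (unperturbed)
baseline, not the perturbed algorithm analyzed in the talk. The function names, the sampler
signature `sample_next_state(s, a, rng)`, and the assumption of a known reward table are all
hypothetical choices made for this example.

```python
import random


def estimate_model(sample_next_state, n_states, n_actions, n_samples, rng):
    """Build an empirical transition kernel from n_samples generative-model calls per (s, a)."""
    p_hat = [[[0.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(n_samples):
                s_next = sample_next_state(s, a, rng)
                p_hat[s][a][s_next] += 1.0 / n_samples
    return p_hat


def value_iteration(p, rewards, gamma, n_iters=500):
    """Plan in the MDP (p, rewards) with discount gamma; returns (values, greedy policy)."""
    n_states, n_actions = len(p), len(p[0])
    values = [0.0] * n_states
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        # Bellman optimality backup for every state-action pair.
        q = [
            [
                rewards[s][a] + gamma * sum(p[s][a][t] * values[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        values = [max(row) for row in q]
    policy = [max(range(n_actions), key=lambda a, s=s: q[s][a]) for s in range(n_states)]
    return values, policy


if __name__ == "__main__":
    # Toy 2-state MDP: action 0 stays put, action 1 switches state; only state 1 pays reward.
    def toy_sampler(s, a, rng):
        return s if a == 0 else 1 - s

    toy_rewards = [[0.0, 0.0], [1.0, 1.0]]
    p_hat = estimate_model(toy_sampler, 2, 2, n_samples=4, rng=random.Random(0))
    print(value_iteration(p_hat, toy_rewards, gamma=0.9))
```

The total number of generative-model calls here is the number of state-action pairs times
`n_samples`, which is the quantity the sample-complexity results above are concerned with.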
Yuting Wei is currently an Assistant Professor in the Department of Statistics at Carnegie Mellon
University. Before joining Carnegie Mellon University, she spent one year as a Stein Fellow /
Lecturer in the Stanford Statistics Department. Yuting received her Ph.D. in Statistics from the
Berkeley Statistics Department, advised by Martin Wainwright and Aditya Guntuboyina. She was the recipient of
the 2018 Erich L. Lehmann Citation from the Berkeley statistics department for an outstanding Ph.D.
dissertation in theoretical statistics.
Information Constrained Optimal Transport: From Talagrand, to Marton, to Cover
The optimal transport problem studies how to transport one measure to another in the most
cost-effective way and has a wide range of applications from economics to machine learning. In this
talk, we introduce and study an information constrained variation of this problem. Our study yields
a strengthening and generalization of Talagrand's celebrated transportation cost inequality.
Following Marton's approach, we show that the new transportation cost inequality can be used to
recover old and new concentration of measure results. Finally, we provide an application of this new
inequality to network information theory. We show that it can be used to recover almost immediately
a recent solution to a long-standing open problem posed by Cover regarding the capacity of the relay
channel.
Xiugang Wu is an assistant professor at the University of Delaware, where he is jointly appointed in
the Department of Electrical and Computer Engineering and the Department of Computer and Information
Sciences, and also affiliated with the Data Science Institute. Previously, he was a postdoctoral
fellow in the Department of Electrical Engineering at Stanford University, and received his Ph.D.
degree in Electrical and Computer Engineering from the University of Waterloo. His research
interests are in information theory, networks, data science, and the interplay between them. He is a
recipient of the 2017 NSF Center for Science of Information (CSoI) Postdoctoral Fellowship.
With the rapid development of deep learning in various application fields, its robustness has
attracted increasing attention. First, this talk will briefly introduce some well-known attack and
defense approaches for deep learning. Then, considering that existing defense approaches mostly
require a prescribed attack strength, we propose a blind defense strategy, which can better
balance the accuracy and robustness of the learned model. Furthermore, we combine our blind defense
approach with model compression to better balance accuracy, efficiency, and robustness. Our
blind defense approaches can yield a comprehensively robust model. We will introduce some ongoing
related projects at the end of this talk.
Xueshuang Xiang received his BSc from Wuhan University in 2009, and his Ph.D. degree in
Computational Mathematics from the Academy of Mathematics and System Sciences, Chinese Academy of
Sciences in 2014. He completed his postdoctoral research at the National University of Singapore in
2016. He is currently an associate research fellow at the Qian Xuesen Laboratory of Space
Technology. His research interests focus on numerical PDEs, deep learning, and their combination.
Stochastic Linear Contextual Bandits with Diverse Contexts
In this talk, we investigate the impact of context diversity on stochastic linear contextual
bandits. As opposed to the previous view that contexts lead to more difficult bandit learning, we
show that when the contexts are sufficiently diverse, the learner is able to utilize the information
obtained during exploitation to shorten the exploration process, thus achieving reduced regret. We
design the LinUCB-d algorithm, and propose a novel approach to analyze its regret performance. The
main theoretical result is that under the diverse context assumption, the cumulative expected regret
of LinUCB-d is bounded by a constant. As a by-product, our results improve the previous
understanding of LinUCB and strengthen its performance guarantee.
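For readers unfamiliar with the baseline, the following is a minimal sketch of standard LinUCB with
a shared parameter vector, written in plain Python. It is not the LinUCB-d algorithm of the talk,
whose details are not given in the abstract. The class name, the exploration weight `alpha`, and the
ridge parameter are illustrative assumptions; the inverse design matrix is maintained with the
Sherman-Morrison rank-one update.

```python
import math


class LinUCB:
    """Standard LinUCB with one shared parameter vector (illustrative sketch)."""

    def __init__(self, dim, alpha=1.0, ridge=1.0):
        self.dim = dim
        self.alpha = alpha  # width multiplier of the confidence bonus
        # Inverse of A = ridge * I + sum of x x^T over observed contexts.
        self.a_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * context

    def _matvec(self, x):
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.a_inv]

    def theta(self):
        """Ridge-regression estimate A^{-1} b of the unknown parameter."""
        return self._matvec(self.b)

    def select(self, contexts):
        """Return the index of the arm with the largest upper confidence bound."""
        theta = self.theta()
        best_arm, best_score = 0, -math.inf
        for arm, x in enumerate(contexts):
            ax = self._matvec(x)
            mean = sum(t * xi for t, xi in zip(theta, x))
            width = self.alpha * math.sqrt(max(0.0, sum(a * xi for a, xi in zip(ax, x))))
            if mean + width > best_score:
                best_arm, best_score = arm, mean + width
        return best_arm

    def update(self, x, reward):
        """Incorporate one (context, reward) observation for the arm that was played."""
        ax = self._matvec(x)
        denom = 1.0 + sum(a * xi for a, xi in zip(ax, x))
        for i in range(self.dim):
            for j in range(self.dim):
                # Sherman-Morrison: (A + x x^T)^{-1} = A^{-1} - (A^{-1}x)(A^{-1}x)^T / (1 + x^T A^{-1} x)
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
        for i in range(self.dim):
            self.b[i] += reward * x[i]
```

A typical loop calls `select` on the current list of per-arm context vectors, plays the returned
arm, and passes that arm's context and the observed reward to `update`.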
Dr. Jing Yang is an assistant professor in the Department of Electrical Engineering, The
Pennsylvania State University, University Park, USA. She received her MS and PhD degrees from the
University of Maryland, College Park, and BS degree from the University of Science and Technology of
China, all in electrical engineering. She currently serves as an editor of the IEEE Transactions on
Wireless Communications and IEEE Transactions on Green Communications and Networking. Her research
interests lie in wireless communications and networking, statistical learning and inference, and
AI Chip Commercialization: From Application to Silicon
We have seen many domain-specific architectures (DSAs) for AI computation, but only a few of them
have gone into mass production. Apart from quality and yield issues, a complete and easy-to-use
software stack that maps applications onto hardware is a major challenge. This talk presents our
work on designing the software stack for our DPU on FPGA. Several different interfaces are designed
to allow fast development for developers at different levels, with neural network pruning and
quantization built in. An extensible instruction set is designed to handle rare neural network
architectures. Backend optimization techniques are also introduced. Our work has been adopted as
Xilinx Vitis AI, the unified software stack recently announced by Xilinx.
Song Yao is the Senior Director of AI Business at Xilinx. Before joining Xilinx, he was the
cofounder and CEO of Deephi Tech, a startup focused on deep learning inference platforms and
solutions, which was acquired by Xilinx in 2018. Song received his BS degree from the Department of
EE, Tsinghua University, in 2015 with honors. He was also a visiting student at Stanford University
in 2014. He has received many awards, including the Best Paper Award at the FPGA’17 conference, the
First Prize of Technology Invention from the China Computer Federation, and the MIT Tech Review
Under 35 Innovators Award.