BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Claremont Center for the Mathematical Sciences - ECPv6.15.17.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Claremont Center for the Mathematical Sciences
X-ORIGINAL-URL:https://colleges.claremont.edu/ccms
X-WR-CALDESC:Events for Claremont Center for the Mathematical Sciences
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20250309T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20251102T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20260308T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20261101T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20251201T161500
DTEND;TZID=America/Los_Angeles:20251201T171500
DTSTAMP:20260506T175155Z
CREATED:20251126T233248Z
LAST-MODIFIED:20251126T233248Z
UID:3935-1764605700-1764609300@colleges.claremont.edu
SUMMARY:Structure-Aware Adaptive Nonconvex Optimization for Deep Learning and Scientific Computing (Minxin Zhang\, UCLA)
DESCRIPTION:Abstract: Modern machine learning and scientific computing pose optimization challenges of unprecedented scale and complexity\, demanding fundamental advances in both theory and algorithmic design for nonconvex optimization. This talk presents recent advances that address these challenges by exploiting matrix and tensor structures\, integrating adaptivity\, and leveraging sampling techniques. In the first part\, I introduce AdaGO\, a new optimizer that combines orthogonalized momentum updates with adaptive learning rates. Building on the recent success of the Muon optimizer in large language model training\, AdaGO incorporates an AdaGrad-type stepsize that scales orthogonalized update directions by accumulated past gradient norms. This design preserves the structural advantage of orthogonalized updates while adapting stepsizes to noise and the optimization landscape. We establish optimal convergence rates for smooth nonconvex functions and demonstrate improved performance over Muon and Adam on classification and regression tasks. The second part focuses on zeroth-order global optimization. We develop a theoretical framework for inexact proximal point (IPP) methods for global optimization\, establishing convergence guarantees when proximal operators are estimated either deterministically or stochastically. The quadratic regularization in the proximal operator induces a concentrated Gibbs measure landscape that facilitates effective sampling. We propose two sampling-based algorithms: TT-IPP\, which constructs a low-rank tensor-train (TT) approximation using a randomized TT-cross algorithm\, and MC-IPP\, which employs Monte Carlo integration. Both IPP algorithms adaptively balance efficiency and accuracy in proximal operator estimation\, achieving strong performance across diverse benchmark functions and applications. 
 Together\, these works advance structure-aware adaptive first-order optimization for deep learning and zeroth-order global optimization in scientific computing.
URL:https://colleges.claremont.edu/ccms/event/structure-aware-adaptive-nonconvex-optimization-for-deep-learning-and-scientific-computing-minxin-zhang-ucla/
LOCATION:Emmy Noether Room\, Estella 1021\, Pomona College\, 610 N. College Ave.\, Claremont\, CA\, 91711\, United States
CATEGORIES:Applied Math Seminar
ORGANIZER;CN="Ryan Aschoff":MAILTO:ryan.aschoff@cgu.edu
END:VEVENT
END:VCALENDAR