deep thinking hour

UvA · ELLIS · VISLab

a series of talks on Deep Learning by experts from industry and academia, hosted at the University of Amsterdam


upcoming events

tutorial

speaker

David W. Romero

from

NVIDIA

time

Thu 11.04.2024 09:00-11:00 CET


livestream

zoom link

title Beyond Transformers: Exploring Subquadratic Long-Context Architectures

abstract Transformers are powerful but challenging to scale to tasks with long context due to their quadratic computational cost in context length. This limitation has prompted the development of alternative architectures that scale subquadratically. This tutorial delves into recent developments in subquadratic long-context architectures, focusing on their foundations and mechanisms. We start with State-Space Models (SSMs), particularly the S4 model, which combines recurrence and convolution. We then explore convolutional models like Hyena, Orchid, and CKConv, which do not rely on the SSM formulation, as well as recent recurrent models like Mamba. After assessing the strengths and limitations of each model family, we conclude with a look at future research directions. Attendees will gain an understanding of modern subquadratic architectures and their significance for Deep Learning applications.
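
For readers new to the recurrence/convolution duality mentioned above, here is a minimal numpy sketch (an illustration, not the tutorial's material) of the linear state-space map behind S4-style models, evaluated two equivalent ways; all names, shapes, and values are chosen for illustration.

import numpy as np

# Toy linear state-space model: x_k = A x_{k-1} + B u_k,  y_k = C x_k.
# S4-style SSMs exploit the fact that this map can be run either step by
# step (a recurrence) or in one shot (a causal convolution).

def ssm_recurrent(A, B, C, u):
    """Sequential scan: one hidden-state update per input step."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append(C @ x)
    return np.array(ys)

def ssm_convolutional(A, B, C, u):
    """The same map as a causal convolution with kernel K_j = C A^j B."""
    L = len(u)
    K = np.array([C @ np.linalg.matrix_power(A, j) @ B for j in range(L)])
    return np.convolve(u, K)[:L]

rng = np.random.default_rng(0)
N, L = 4, 16
A = 0.9 * np.eye(N)                              # stable toy dynamics
B, C = rng.normal(size=N), rng.normal(size=N)
u = rng.normal(size=L)
assert np.allclose(ssm_recurrent(A, B, C, u), ssm_convolutional(A, B, C, u))

Materializing K naively is expensive; S4's structured parameterization makes this kernel cheap to compute for training, while the recurrent view keeps autoregressive inference at constant cost per step.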

past events

tutorial

speaker

Phillip Lippe

from

VISLab, UvA

time

Mon 11.03.2024 17:00-19:00 CET


livestream

zoom link

resources

title Training models at scale

abstract This tutorial equips you with the knowledge to efficiently train large models 🔥. We'll explore various distributed training strategies, including fully-sharded data parallelism, pipeline parallelism, and tensor parallelism, alongside single-GPU optimizations such as mixed-precision training and gradient checkpointing. The tutorial is framework-agnostic, so no prior knowledge of JAX or PyTorch is needed. By the end, you'll have the skills to navigate the complexities of large-scale training.
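
As a concrete taste of the single-GPU optimizations named in the abstract, here is a short PyTorch sketch (my illustration under stated assumptions, not the tutorial's framework-agnostic material) combining mixed-precision autocast with gradient checkpointing; it assumes a CUDA device and uses toy layer sizes.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Toy stack of blocks. Checkpointed blocks discard their activations in the
# forward pass and recompute them during backward, trading compute for memory.
blocks = nn.ModuleList(
    [nn.Sequential(nn.Linear(512, 512), nn.GELU()) for _ in range(8)]
).cuda()
opt = torch.optim.AdamW(blocks.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # loss scaling avoids fp16 gradient underflow

def forward(x):
    for block in blocks:
        x = checkpoint(block, x, use_reentrant=False)  # activations recomputed
    return x

x = torch.randn(32, 512, device="cuda")
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = forward(x).square().mean()  # eligible ops run in half precision
scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()

The same two ideas carry over to the distributed strategies above: sharding and pipelining decide where tensors live, while precision and checkpointing decide how much memory each device needs.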

tutorial

speaker

Alex Gabel

from

VISLab, UvA

time

Wed 06.03.2024 17:00-19:00 CET

title Differential geometry for deep learning

abstract This tutorial introduces differential manifolds to machine learning researchers, covering fundamental concepts such as charts, partitions of unity, and fiber bundles. Emphasizing the construction of global structures from local properties, particularly in Euclidean space, it addresses advanced topics like differential forms and integration, with applications to machine learning. Throughout, the tutorial underscores the importance of these mathematical tools for understanding complex data structures and improving modeling techniques, and it points to practical applications within the field.
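
For readers new to the vocabulary, the following LaTeX fragment states the standard textbook definitions the tutorial builds on (not taken from the tutorial slides):

% Standard definitions; textbook material, not the tutorial's notation.
A \emph{chart} on a topological space $M$ is a pair $(U, \varphi)$ with
$U \subseteq M$ open and $\varphi : U \to \varphi(U) \subseteq \mathbb{R}^n$
a homeomorphism. A family $\{(U_\alpha, \varphi_\alpha)\}$ covering
$M = \bigcup_\alpha U_\alpha$ is a \emph{smooth atlas} when every transition map
\[
  \varphi_\beta \circ \varphi_\alpha^{-1} :
  \varphi_\alpha(U_\alpha \cap U_\beta) \to \varphi_\beta(U_\alpha \cap U_\beta)
\]
is $C^\infty$. A \emph{partition of unity} subordinate to $\{U_\alpha\}$ is a
family of smooth maps $\rho_\alpha : M \to [0,1]$ with
$\operatorname{supp}\,\rho_\alpha \subset U_\alpha$ and
$\sum_\alpha \rho_\alpha \equiv 1$; it is the standard tool for gluing local
constructions into the global structures the abstract refers to.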

seminar

speaker

Rianne van den Berg

from

Microsoft Research

time

Wed 10.05.2023 16:00 CET

location

L3.36, Lab 42, Science Park, Amsterdam

title AI4Science at Microsoft Research

abstract In July 2022, Microsoft announced a new global team in Microsoft Research, spanning the UK, China, and the Netherlands, to focus on AI for science. In this talk I will discuss some of the research areas that we are currently exploring in AI4Science at Microsoft Research, covering topics such as drug discovery, material generation, neural PDE solvers, and electronic structure theory. I will then dive deeper into two examples of projects recently done at Microsoft Research.

seminar

speaker

David Ruhe

from

AMLab, UvA

time

Wed 01.03.2023 16:00 CET

location

L3.36, Lab 42, Science Park, Amsterdam

title Geometric Clifford Algebra Networks

abstract In this talk, I explain our recently proposed Geometric Clifford Algebra Networks (GCANs), which are based on symmetry group transformations using geometric (Clifford) algebras. GCANs are particularly well-suited for representing and manipulating the geometric transformations often found in dynamical systems. These theoretical advantages are strongly reflected in the modeling of three-dimensional rigid-body transformations as well as large-scale fluid dynamics simulations, where GCANs show significantly improved performance over traditional methods.
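
To give a flavor of the transformations involved, here is a minimal numpy sketch (my illustration, not the paper's code) of the rotor "sandwich" action v -> R v R~ in 3D, using the standard realization of the even subalgebra of Cl(3,0) as quaternions; all function names are hypothetical.

import numpy as np

# Quaternions (w, x, y, z) realize the rotors of Cl(3,0).
def quat_mul(p, q):
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

def rotor_rotate(v, axis, angle):
    """Sandwich product v' = R v R~: the rotor action on a 3D vector."""
    axis = axis / np.linalg.norm(axis)
    R = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])  # unit rotor
    R_rev = R * np.array([1.0, -1.0, -1.0, -1.0])                        # reversion
    v_q = np.concatenate([[0.0], v])                                     # embed vector
    return quat_mul(quat_mul(R, v_q), R_rev)[1:]

# Rotating e_x by 90 degrees about e_z yields e_y:
print(rotor_rotate(np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]), np.pi / 2))

Roughly speaking, GCAN layers parameterize networks with group elements like R and their compositions rather than unconstrained weight matrices, so geometric structure is preserved by construction.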

panel discussion

panelists

Jakub Tomczak (TUe), Yuki Asano (UvA), Efstratios Gavves (UvA), Emiel Hoogeboom (Google Brain)

time

Thu 19.01.2023 14:00 CET

location

L3.36, Lab 42, Science Park, Amsterdam

title Modelling versus scaling in modern Deep Learning

abstract What does it mean to model accurately using generative models? Is it about building informative representations of real-world data? Do generative models allow us to investigate questions and ideas about the world that we couldn't before? Recent foundation models, such as DALL-E, Imagen, ChatGPT, and GPT-4, seem to achieve incredible performance by leveraging enormous resources, both in terms of computation and data. What are the limits of such data and compute scaling? Should (academic) researchers focus their attention on better scaling algorithms? Is there even any role left for modelling through inductive biases in this era of large-scale models? All this and more will be covered in this first edition of our panel discussion format by an invited panel of influential researchers.

organisers

Samuele Papa s.papa@uva.nl

Riccardo Valperga r.valperga@uva.nl

David Knigge d.m.knigge@uva.nl

The Deep Thinking Hour is a series of talks and panel discussions on advancements in Deep Learning, hosted at the University of Amsterdam. This initiative is supported by ELLIS.