The Wayback Machine - https://web.archive.org/web/20230108112840/https://github.com/huggingface/diffusers/issues/1880

Refactor attention.py #1880

Open
patrickvonplaten opened this issue Jan 1, 2023 · 2 comments
@patrickvonplaten
Member

patrickvonplaten commented Jan 1, 2023

attention.py currently has two concurrent attention implementations that essentially do the exact same thing:

Both

class CrossAttention(nn.Module):
and
class AttentionBlock(nn.Module):
are already used for "simple" attention, e.g. the former for Stable Diffusion and the latter for the simple DDPM UNet.

We should start deprecating

class AttentionBlock(nn.Module):
very soon, as it is not viable to maintain two attention implementations.

Deprecating this class won't be easy, as it essentially means forcing people to re-upload their weights. Every model checkpoint that makes use of

class AttentionBlock(nn.Module):
will eventually have to have its weights re-uploaded to remain compatible.
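One way to soften the migration would be a one-off conversion script that renames the legacy keys in a checkpoint before re-upload. A minimal sketch, assuming a key mapping along these lines (the exact AttentionBlock-to-CrossAttention correspondence would have to be pinned down as part of the refactor):

```python
import torch

# Hypothetical key mapping from AttentionBlock parameter names to
# CrossAttention parameter names; the real correspondence would need
# to be fixed as part of the refactor.
KEY_MAP = {
    "query.weight": "to_q.weight",
    "key.weight": "to_k.weight",
    "value.weight": "to_v.weight",
    "proj_attn.weight": "to_out.0.weight",
}

def convert_checkpoint(state_dict: dict) -> dict:
    """Return a copy of state_dict with legacy attention keys renamed."""
    converted = {}
    for key, tensor in state_dict.items():
        for old, new in KEY_MAP.items():
            if key.endswith(old):
                key = key[: -len(old)] + new
                break
        converted[key] = tensor
    return converted
```

Keys that don't match any legacy name pass through untouched, so the script is safe to run on a whole checkpoint.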

I would propose to do this in the following way:

Happy to help with this PR. @williamberman are you maybe interested in taking this one?

@williamberman williamberman self-assigned this Jan 1, 2023
@Birch-san
Contributor

Birch-san commented Jan 3, 2023

@patrickvonplaten Rather than having everybody save and re-upload their weights: can diffusers intercept the weights during model load and map them to different parameter names?

Apple uses PyTorch's _register_load_state_dict_pre_hook() idiom to intercept the weights while the state dict is being loaded, transform them, and redirect them into different Parameters:
https://github.com/apple/ml-ane-transformers/blob/da64000fa56cc85b0859bc17cb16a3d753b8304a/ane_transformers/huggingface/distilbert.py#L241
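For reference, the idiom itself is small. A self-contained sketch (the module and parameter names here are made up for illustration, not diffusers code):

```python
import torch
import torch.nn as nn

class RenamedParamModule(nn.Module):
    """Module whose parameter was renamed from `old_weight` to `weight`;
    a load-state-dict pre-hook keeps legacy checkpoints loadable."""

    def __init__(self, dim: int):
        super().__init__()
        self.weight = nn.Parameter(torch.zeros(dim))
        # Private PyTorch API (the one used by ml-ane-transformers):
        # the hook runs on the incoming state dict before it is applied.
        self._register_load_state_dict_pre_hook(self._remap_legacy_keys)

    def _remap_legacy_keys(self, state_dict, prefix, local_metadata,
                           strict, missing_keys, unexpected_keys, error_msgs):
        old_key = prefix + "old_weight"
        if old_key in state_dict:
            state_dict[prefix + "weight"] = state_dict.pop(old_key)

# A checkpoint saved under the old parameter name still loads:
module = RenamedParamModule(4)
module.load_state_dict({"old_weight": torch.ones(4)})
```

The hook mutates the state dict in place before strict key checking runs, so no missing/unexpected-key errors are raised for the legacy name.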

However, something about HF's model-loading technique breaks this idiom: my model loading hooks never get invoked. They work in a CompVis repository, but not inside HF diffusers code. I think something about using importlib to load a .bin skips them. It would be really good if you could fix that; it's the number one thing that made it difficult for me to optimize the diffusers UNet for the Neural Engine.

In the end, this is the technique I had to resort to in order to replace every AttentionBlock with CrossAttention (after model loading):
Birch-san/diffusers-play@bf9b13e
You may find it a useful reference for how to map between the two.
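The general shape of that swap-after-load technique can be sketched as follows. OldBlock and NewBlock are hypothetical stand-ins; the actual AttentionBlock-to-CrossAttention mapping in the linked commit has to reconcile more parameters than this:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two attention classes.
class OldBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Linear(channels, channels)

class NewBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.to_out = nn.Linear(channels, channels)

def swap_blocks(model: nn.Module) -> None:
    """Recursively replace every OldBlock with an equivalent NewBlock,
    copying the weights across the renamed submodule."""
    for name, child in model.named_children():
        if isinstance(child, OldBlock):
            replacement = NewBlock(child.proj.in_features)
            replacement.to_out.load_state_dict(child.proj.state_dict())
            setattr(model, name, replacement)
        else:
            swap_blocks(child)
```

Because the swap runs after load_state_dict has already succeeded, it does not depend on the broken pre-hook path at all.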

@williamberman
Contributor

williamberman commented Jan 4, 2023

@Birch-san Thank you for the added context, super helpful! I don't have much to add right now. When I start working on the refactor, I'll think about it more and we can discuss :)
