gnot's Issues
How does GNOT work for time-dependent systems?
Thanks a lot for your nice work!
I am wondering how this model is applied to time-dependent systems like NS2D, since I don't see any details in either the GitHub repo or the paper. Do you use an autoregressive scheme, or do you directly change the input and output channels to include the time dimension?
I would appreciate your response.
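For context, the autoregressive option mentioned above feeds each prediction back in as the next input function. A minimal sketch of such a rollout, assuming a trained one-step model (the `model(state, coords, theta)` signature here is hypothetical, not GNOT's actual API):

```python
import torch

def rollout(model, u0, coords, theta, n_steps):
    """Hypothetical autoregressive rollout: the model maps the state at time t
    (plus query coordinates and global parameters theta) to the state at t+1."""
    states = [u0]
    for _ in range(n_steps):
        with torch.no_grad():
            states.append(model(states[-1], coords, theta))
    return torch.stack(states, dim=1)  # (B, n_steps + 1, N, C)
```

The alternative in the question, stacking time as extra channels, would instead predict all steps in a single forward pass at the cost of fixing the temporal resolution.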
Theta being concatenated with X?
In all your models, the model parameters u_p are concatenated with the grid x as an initial operation in every forward pass (before being passed through an MLP and then the attention blocks). The architecture diagram in the paper and README indicates that theta should be processed as keys and values, but wouldn't this concatenation conflate it with the query instead?
x = torch.cat([x, u_p.unsqueeze(1).repeat([1, x.shape[1], 1])], dim=-1)
Note: for the ns2d_1100_test.pkl file (rectangular grid with circular cavities), the theta array is all zeros; what information is it holding? No initial conditions or boundary conditions (inlet/outlet) are specified in the model or data either. Were these inferred purely through training?
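To make the shapes in that line concrete, here is what the concatenation does in isolation (the dimensions below are illustrative, not from the dataset):

```python
import torch

B, N, d_x, d_theta = 2, 100, 2, 3
x = torch.rand(B, N, d_x)       # query grid points
u_p = torch.rand(B, d_theta)    # global parameters theta, one vector per sample
# theta is broadcast to every grid point and appended as extra channels,
# so it enters the network on the query side:
x_aug = torch.cat([x, u_p.unsqueeze(1).repeat([1, x.shape[1], 1])], dim=-1)
print(x_aug.shape)  # torch.Size([2, 100, 5])
```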
Implementation with Geo-FNO NACA dataset
Hi dear author,
Thanks for your paper and implementation.
Possibly a silly question: could you please provide more specific details about running the model on the NACA dataset? For the NACA dataset used in FNO(-interp), directly using the mask and Mach number as input and output may not fit this model architecture.
About scaling experiments
Thank you for your excellent work! I want to ask about the training settings for the scaling experiments in the paper. Do you keep the same strategy for every training-data size? If you use a cyclic learning-rate schedule per iteration, the learning rate at a given step would differ across experiments even when the schedule itself is the same, because the number of steps per epoch depends on the dataset size.
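To illustrate the concern, here is a small check assuming a PyTorch OneCycleLR-style cyclic schedule (the schedule actually used in the paper may differ):

```python
import torch
from torch.optim.lr_scheduler import OneCycleLR

def lr_after(total_steps, n_steps, max_lr=1e-3):
    """Learning rate reached after n_steps of a one-cycle schedule
    that spans total_steps in total."""
    opt = torch.optim.AdamW(torch.nn.Linear(4, 4).parameters(), lr=max_lr)
    sched = OneCycleLR(opt, max_lr=max_lr, total_steps=total_steps)
    for _ in range(n_steps):
        opt.step()
        sched.step()
    return sched.get_last_lr()[0]

# Same schedule shape, but different dataset sizes imply different total
# step counts, so the LR at the same optimizer step differs between runs.
print(lr_after(total_steps=100, n_steps=50), lr_after(total_steps=1000, n_steps=50))
```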
Running HeatSink 3D
Hi,
First of all great work!
I would like to run this model on an example similar to yours, the 3D heat sink. However, I have a few questions:
- The data you point to (heat3d) is just an array of shape (1000, 19517, 5). I assume the last dimension holds 5 functions; would you be so kind as to elaborate on which functions/features this array contains?
- The same link (heat3d) also contains a file "Heatsink_Output_XYZ.npy", but where can I find the ground truth?
- How do you generate the 1000 samples? Are they just different simulation scenarios?
Looking forward to your answer!
Multi output functions
I really like the concept of GNOT, as it allows me to combine multiple input functions and an input vector of scalars, but is it able to learn multiple output functions as well?
Does padding in GNOT contaminate the attention matrix?
In NLP, we have a masking mechanism to help prevent this. But in GNOT, the following code in https://github.com/HaoZhongkai/GNOT/blob/master/models/cgpt.py appears to have no masking procedure:
class LinearAttention(nn.Module):
    """
    A vanilla multi-head masked self-attention layer with a projection at the end.
    It is possible to use torch.nn.MultiheadAttention here but I am including an
    explicit implementation here to show that there is nothing too scary here.
    """
    def __init__(self, config):
        super(LinearAttention, self).__init__()
        assert config.n_embd % config.n_head == 0
        # key, query, value projections for all heads
        self.key = nn.Linear(config.n_embd, config.n_embd)
        self.query = nn.Linear(config.n_embd, config.n_embd)
        self.value = nn.Linear(config.n_embd, config.n_embd)
        # regularization
        self.attn_drop = nn.Dropout(config.attn_pdrop)
        # output projection
        self.proj = nn.Linear(config.n_embd, config.n_embd)
        self.n_head = config.n_head
        self.attn_type = 'l1'

    '''
    Linear Attention and Linear Cross Attention (if y is provided)
    '''
    def forward(self, x, y=None, layer_past=None):
        y = x if y is None else y
        B, T1, C = x.size()
        _, T2, _ = y.size()
        # calculate query, key, values for all heads in batch and move head forward to be the batch dim
        q = self.query(x).view(B, T1, self.n_head, C // self.n_head).transpose(1, 2)  # (B, nh, T1, hs)
        k = self.key(y).view(B, T2, self.n_head, C // self.n_head).transpose(1, 2)    # (B, nh, T2, hs)
        v = self.value(y).view(B, T2, self.n_head, C // self.n_head).transpose(1, 2)  # (B, nh, T2, hs)

        if self.attn_type == 'l1':
            q = q.softmax(dim=-1)
            k = k.softmax(dim=-1)
            k_cumsum = k.sum(dim=-2, keepdim=True)
            D_inv = 1. / (q * k_cumsum).sum(dim=-1, keepdim=True)  # normalized
        elif self.attn_type == "galerkin":
            q = q.softmax(dim=-1)
            k = k.softmax(dim=-1)
            D_inv = 1. / T2  # galerkin
        elif self.attn_type == "l2":  # still use l1 normalization
            q = q / q.norm(dim=-1, keepdim=True, p=1)
            k = k / k.norm(dim=-1, keepdim=True, p=1)
            k_cumsum = k.sum(dim=-2, keepdim=True)
            D_inv = 1. / (q * k_cumsum).abs().sum(dim=-1, keepdim=True)  # normalized
        else:
            raise NotImplementedError

        context = k.transpose(-2, -1) @ v
        y = self.attn_drop((q @ context) * D_inv + q)
        # output projection
        y = rearrange(y, 'b h n d -> b n (h d)')
        y = self.proj(y)
        return y
class LinearCrossAttention(nn.Module):
    """
    A vanilla multi-head masked self-attention layer with a projection at the end.
    It is possible to use torch.nn.MultiheadAttention here but I am including an
    explicit implementation here to show that there is nothing too scary here.
    """
    def __init__(self, config):
        super(LinearCrossAttention, self).__init__()
        assert config.n_embd % config.n_head == 0
        # key, query, value projections for all heads
        self.query = nn.Linear(config.n_embd, config.n_embd)
        self.keys = nn.ModuleList([nn.Linear(config.n_embd, config.n_embd) for _ in range(config.n_inputs)])
        self.values = nn.ModuleList([nn.Linear(config.n_embd, config.n_embd) for _ in range(config.n_inputs)])
        # regularization
        self.attn_drop = nn.Dropout(config.attn_pdrop)
        # output projection
        self.proj = nn.Linear(config.n_embd, config.n_embd)
        self.n_head = config.n_head
        self.n_inputs = config.n_inputs
        self.attn_type = 'l1'

    '''
    Linear Attention and Linear Cross Attention (if y is provided)
    '''
    def forward(self, x, y=None, layer_past=None):
        y = x if y is None else y
        B, T1, C = x.size()
        # calculate query, key, values for all heads in batch and move head forward to be the batch dim
        q = self.query(x).view(B, T1, self.n_head, C // self.n_head).transpose(1, 2)  # (B, nh, T1, hs)
        q = q.softmax(dim=-1)
        out = q
        for i in range(self.n_inputs):
            _, T2, _ = y[i].size()
            k = self.keys[i](y[i]).view(B, T2, self.n_head, C // self.n_head).transpose(1, 2)    # (B, nh, T2, hs)
            v = self.values[i](y[i]).view(B, T2, self.n_head, C // self.n_head).transpose(1, 2)  # (B, nh, T2, hs)
            k = k.softmax(dim=-1)
            k_cumsum = k.sum(dim=-2, keepdim=True)
            D_inv = 1. / (q * k_cumsum).sum(dim=-1, keepdim=True)  # normalized
            out = out + (q @ (k.transpose(-2, -1) @ v)) * D_inv
        # output projection
        out = rearrange(out, 'b h n d -> b n (h d)')
        out = self.proj(out)
        return out
So it seems the entries of the attention matrix are contaminated by the padded part. Is that true? Thanks.
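For reference, one way to keep padded positions out of the sums in this linear-attention scheme is to zero the padded key rows before they enter `k_cumsum` and `context`. A standalone sketch, not the repository's code (the `key_mask` argument is an assumption):

```python
import torch

def masked_linear_attention(q, k, v, key_mask):
    """q: (B, nh, T1, hs); k, v: (B, nh, T2, hs);
    key_mask: (B, T2) with 1.0 for real tokens and 0.0 for padding."""
    q = q.softmax(dim=-1)
    k = k.softmax(dim=-1)
    # zero out padded key positions so they contribute nothing to either sum
    mask = key_mask[:, None, :, None]  # broadcast to (B, 1, T2, 1)
    k = k * mask
    k_cumsum = k.sum(dim=-2, keepdim=True)
    D_inv = 1. / (q * k_cumsum).sum(dim=-1, keepdim=True).clamp_min(1e-8)
    context = k.transpose(-2, -1) @ v
    return (q @ context) * D_inv
```

With the padded key rows zeroed, both the normalizer and the context matrix become independent of whatever values sit in the padded slots.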
License for contributions and loading from checkpoints?
Hi Hao Zhongkai,
Cool library! What license is this under? I'd like to add some checkpoint-loading capability, and I can see you made an attempt at this yourself; anything to be aware of beforehand?
3D Heat Sink Case
Great piece of work here. I am particularly interested in the 3D heat-sink case; could you please supply the 3D heat-sink training data and the case-handling functions so I can replicate your results?
Getting Error while creating Conda Environment with requirements.txt
PackagesNotFoundError: The following packages are not available from current channels:
- einops==0.6.0
- dgl==1.0.1+cu116
- torch==1.10.0
- scikit_learn==1.2.1
Missing data
Hi,
I noticed that some .mph data files are missing from the shared Google Drive.
I'm interested in the 3D HeatSink data files or, failing that, the .mph files so I can understand the data format. The paper only says: "This example is complicated and we omit the technical details here and they could be found in the mph source files."
Thank you!
Multi-GPU training?
I have adapted this using simple DataParallel from PyTorch, but the model sometimes outputs `nan`s. Have you been able to train this across multiple GPUs on a single node?