It seems like the RAM consumption refers to the CPU RAM, right? However, GPU compu

GPU RAM consumption? about hyperlearn HOT 4 CLOSED

farleylai commented on May 18, 2024 1

GPU RAM consumption?

from hyperlearn.

Comments (4)

danielhanchen commented on May 18, 2024 1

Hi, good question. I try my best to reduce total memory usage, using not just computational tricks but also mathematical ones (i.e. more memory-efficient algorithms). For K-Means and TSNE, yes, you're right, memory can explode dramatically. TSNE, for example, has to perform Barnes Hut first, which is essentially X*X.T for Euclidean distances. Not very memory efficient at all. Likewise, K-Means can cause problems when updating the centroids: subtracting the centroids from a matrix creates a temporary copy, which doubles memory use.
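To make the point about temporaries concrete, here is a minimal NumPy sketch of the K-Means assignment step. It is not HyperLearn's code; the function names are made up for illustration. The naive version builds an (n, k, d) temporary through broadcasting, while the expanded form ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2 only needs an (n, k) matrix.

```python
import numpy as np


def assign_naive(X, C):
    """Nearest-centroid labels via broadcasting (allocates an (n, k, d) temporary)."""
    diff = X[:, None, :] - C[None, :, :]          # shape (n, k, d)
    return np.argmin((diff ** 2).sum(axis=2), axis=1)


def assign_expanded(X, C):
    """Nearest-centroid labels via the expanded squared distance.

    ||x||^2 is the same for every centroid, so it can be dropped
    when only the argmin is needed. Largest temporary is (n, k).
    """
    cross = X @ C.T                               # shape (n, k)
    c_norms = (C ** 2).sum(axis=1)                # shape (k,)
    return np.argmin(c_norms - 2.0 * cross, axis=1)
```

Both functions should return the same labels on well-separated data; the expanded form can differ on near-ties because of floating-point rounding.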

My aim is to reduce these temporary copies. But yes, I can deal with the input as long as your data fits in memory.

Say you load data into RAM and it takes 12 GB out of 16 GB. Then I have 4 GB to play with. If you use, for example, Sklearn's or Scipy's linear regression / lstsq, memory usage spikes and you can run out of memory. Using Cholesky decomposition, I reduced the overall memory usage by more than 50%.
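For readers unfamiliar with the Cholesky route, a rough sketch of least squares through the normal equations follows. This is an illustration only, not HyperLearn's implementation, and `lstsq_cholesky` is a hypothetical name. The idea is that the only matrices formed are X.T @ X of shape (d, d) and X.T @ y, so the extra memory does not grow with the number of rows.

```python
import numpy as np


def lstsq_cholesky(X, y):
    """Solve min ||X w - y|| via the normal equations (X^T X) w = X^T y.

    Assumes X has full column rank, otherwise np.linalg.cholesky
    raises LinAlgError.
    """
    XtX = X.T @ X                  # shape (d, d), independent of n
    Xty = X.T @ y                  # shape (d,)
    L = np.linalg.cholesky(XtX)    # XtX = L @ L.T, L lower triangular
    # Two solves: L z = X^T y, then L^T w = z. A dedicated triangular
    # solver would exploit the structure; np.linalg.solve keeps the
    # sketch NumPy-only.
    z = np.linalg.solve(L, Xty)
    return np.linalg.solve(L.T, z)
```

One trade-off worth knowing: forming X.T @ X squares the condition number, so this approach is less accurate than an SVD- or QR-based solver on ill-conditioned data.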

Also, yes, I have planned to break down computations if the memory usage exceeds the total capacity. You can check Hyperlearn2, where I have started implementing the necessary memory checks before something is even run.

danielhanchen commented on May 18, 2024

https://github.com/danielhanchen/hyperlearn/blob/master/hyperlearn2/base.py

I'm slowly writing decorators to first:

1. Convert the dtype to the lowest possible precision (int32 --> float32, not float64), etc.
2. Check the likely memory usage and, for now, tell the user there's a memory problem.
3. If possible, use more memory-efficient algorithms when memory is tight (e.g. GESVD instead of GESDD).
4. In the future, I plan to perform batch processing. Say for K-Means, instead of subtracting every centroid in one go, I will do it sequentially.
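A decorator along these lines might look roughly like the sketch below. It is not taken from hyperlearn2/base.py: `memory_aware`, `temp_copies` and `available_bytes` are hypothetical names, and the memory estimate (input size times a copy count) is a deliberately crude assumption.

```python
import functools

import numpy as np


def memory_aware(temp_copies=2, available_bytes=None):
    """Hypothetical decorator: downcast the first argument, then check memory.

    temp_copies     -- assumed number of input-sized arrays the function needs
    available_bytes -- memory budget supplied by the caller; None skips the check
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(X, *args, **kwargs):
            X = np.asarray(X)
            # Integers up to 32 bits go to float32 instead of NumPy's
            # default promotion to float64 (precision is lost above 2**24).
            if np.issubdtype(X.dtype, np.integer) and X.dtype.itemsize <= 4:
                X = X.astype(np.float32)
            needed = X.nbytes * temp_copies
            if available_bytes is not None and needed > available_bytes:
                raise MemoryError(
                    f"{func.__name__} needs about {needed} bytes, "
                    f"but only {available_bytes} are available"
                )
            return func(X, *args, **kwargs)
        return wrapper
    return decorator


@memory_aware(temp_copies=2, available_bytes=1_000_000)
def center(X):
    """Subtract the column means (allocates one input-sized result)."""
    return X - X.mean(axis=0)
```

In practice the budget would come from querying the system rather than a fixed number, and a warning may be friendlier than an exception; both choices here are just to keep the sketch self-contained.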

HyperLearn's goal is to make ML faster, but also less resource intensive.

farleylai commented on May 18, 2024

It has been quite a while. Since then, I managed to implement batch processing that iteratively computes the metrics by moving chunks between GPU and CPU memory. It works well. I believe this could be a general pattern for similar methods, given how large datasets keep growing. On the other hand, methods that do not need everything in memory at once, such as UMAP, should be encouraged. That should make life easier.
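The general pattern could be sketched as below. This is not the code referred to in the comment; it is a CPU-only NumPy illustration with hypothetical names (`iter_chunks`, `chunked_inertia`). On a real GPU setup, the marked line is where each chunk would be copied to device memory and the partial result copied back.

```python
import numpy as np


def iter_chunks(X, chunk_rows):
    """Yield consecutive row blocks of X (views, not copies)."""
    for start in range(0, X.shape[0], chunk_rows):
        yield X[start:start + chunk_rows]


def chunked_inertia(X, C, chunk_rows=1024):
    """Sum of squared distances to the nearest centroid, one chunk at a time.

    Peak temporary memory is about chunk_rows * k values instead of n * k.
    """
    c_norms = (C ** 2).sum(axis=1)                    # shape (k,)
    total = 0.0
    for chunk in iter_chunks(X, chunk_rows):
        # Assumption: a GPU version would transfer `chunk` to the device here.
        x_norms = (chunk ** 2).sum(axis=1)[:, None]   # shape (rows, 1)
        d2 = x_norms - 2.0 * (chunk @ C.T) + c_norms  # shape (rows, k)
        # Clip tiny negative values caused by rounding before reducing.
        total += float(np.maximum(d2, 0.0).min(axis=1).sum())
    return total
```

The chunk size trades memory for the number of transfers, so it would normally be chosen from the free device memory rather than fixed.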

danielhanchen commented on May 18, 2024

@farleylai Heyy!! So sorry I closed your issue - we have a new Discord channel https://discord.gg/eJQzD4sH
We're repackaging the entire package and making it fully streamlined, much faster and supporting many more algos.

I agree on batch processing - processing chunks iteratively between GPU RAM and CPU RAM is a smart approach!
