Dear all, I used TGAN in PyPI (ver 0.1.0) and found that column name

Column names are replaced by index about tgan HOT 4 OPEN

sdv-dev commented on May 29, 2024

Column names are replaced by index

from tgan.

Comments (4)

csala commented on May 29, 2024

Thanks for reporting this @upura

The problem seems to be here: https://github.com/DAI-Lab/TGAN/blob/f5b9a9cbd9e4bc2f0755bdcf24daef537594cd72/tgan/data.py#L322

The fix would be to avoid replacing the column names, and rather use an enumerate on the subsequent loop to get the right i value without having to alter the data object.
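A minimal sketch of that suggestion, assuming the preprocessing loop currently overwrites `data.columns` with integer positions and then iterates over them. The `transform_columns` helper and the placeholder transformations are hypothetical and are not TGAN's actual code; `continuous_columns` is assumed to hold positional indices.

```python
import numpy as np
import pandas as pd


def transform_columns(data, continuous_columns):
    """Hypothetical stand-in for the preprocessing loop (not TGAN's real code).

    `continuous_columns` is assumed to hold positional column indices.
    """
    transformed = {}
    # enumerate() yields the positional index i alongside the real column name,
    # so data.columns never has to be overwritten with integer positions.
    for i, name in enumerate(data.columns):
        # Select by position; the caller's DataFrame is left unmodified.
        column = data.iloc[:, i].values.reshape([-1, 1])
        if i in continuous_columns:
            # Placeholder for the continuous transformer.
            transformed['f%02d' % i] = column.astype(float)
        else:
            # Placeholder for the categorical encoding.
            transformed['f%02d' % i] = column
    return transformed


df = pd.DataFrame({'age': [22.0, 38.0, 26.0], 'sex': ['m', 'f', 'f']})
out = transform_columns(df, continuous_columns=[0])
print(sorted(out))       # index-based keys for the transformed features
print(list(df.columns))  # the original column names are still in place
```

The index-based keys are still produced for the transformed features, but the DataFrame passed in by the caller keeps its own column names.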

ManuelAlvarezC commented on May 29, 2024

Hi @upura and thanks for your question.

"the trouble is that the function fit_transform is called several times"

I haven't been able to find in which case that can occur. Could you please provide a snippet of code that reproduces your issue?

Thanks.

upura commented on May 29, 2024

Hello @ManuelAlvarezC, thank you for your reply.

Here is the notebook I used. Sorry for the Japanese comments. After fitting TGAN, it looks like the column names are replaced (at cell [16]).
https://github.com/upura/upura.hatenablog/blob/master/books_sites/tgan/tgan-titanic.ipynb

I have now rechecked the code and found that what I said was wrong. But I still can't see why the column names are replaced.

"the trouble is that the function fit_transform is called several times"

Best.

hasinaattaullah commented on May 29, 2024

@csala could you please elaborate a bit on how exactly we would do it?
"The fix would be to avoid replacing the column names, and rather use an enumerate on the subsequent loop to get the right i value without having to alter the data object."
I am also working on this and I am getting the following error:
ValueError Traceback (most recent call last)
in
----> 1 tgan.fit(data)

~\Anaconda3\lib\site-packages\tgan\model.py in fit(self, data)
678 """
679 self.preprocessor = Preprocessor(continuous_columns=self.continuous_columns)
--> 680 data = self.preprocessor.fit_transform(data)
681 self.metadata = self.preprocessor.metadata
682 dataflow = TGANDataFlow(data, self.metadata)

~\Anaconda3\lib\site-packages\tgan\data.py in fit_transform(self, data, fitting)
328 if i in self.continuous_columns:
329 column_data = data[i].values.reshape([-1, 1])
--> 330 features, probs, means, stds = self.continous_transformer.transform(column_data)
331 transformed_data['f%02d' % i] = np.concatenate((features, probs), axis=1)
332

~\Anaconda3\lib\site-packages\tgan\data.py in decorated(self, data, *args, **kwargs)
61 raise ValueError('The argument data must be a numpy.ndarray with shape (n, 1).')
62
---> 63 return function(self, data, *args, **kwargs)
64
65 decorated.__doc__ = function.__doc__

~\Anaconda3\lib\site-packages\tgan\data.py in transform(self, data)
238 """
239 model = GaussianMixture(self.num_modes)
--> 240 model.fit(data)
241
242 means = model.means_.reshape((1, self.num_modes))

~\Anaconda3\lib\site-packages\sklearn\mixture\base.py in fit(self, X, y)
192 self
193 """
--> 194 self.fit_predict(X, y)
195 return self
196

~\Anaconda3\lib\site-packages\sklearn\mixture\base.py in fit_predict(self, X, y)
218 Component labels.
219 """
--> 220 X = _check_X(X, self.n_components, ensure_min_samples=2)
221 self._check_initial_parameters(X)
222

~\Anaconda3\lib\site-packages\sklearn\mixture\base.py in _check_X(X, n_components, n_features, ensure_min_samples)
53 """
54 X = check_array(X, dtype=[np.float64, np.float32],
---> 55 ensure_min_samples=ensure_min_samples)
56 if n_components is not None and X.shape[0] < n_components:
57 raise ValueError('Expected n_samples >= n_components '

~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
571 if force_all_finite:
572 _assert_all_finite(array,
--> 573 allow_nan=force_all_finite == 'allow-nan')
574
575 shape_repr = _shape_repr(array.shape)

~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in _assert_all_finite(X, allow_nan)
54 not allow_nan and not np.isfinite(X).all()):
55 type_err = 'infinity' if allow_nan else 'NaN, infinity'
---> 56 raise ValueError(msg_err.format(type_err, X.dtype))
57
58

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
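The final ValueError is raised by scikit-learn's input validation, which suggests that one of the continuous columns passed to GaussianMixture contains NaN or infinite values. A minimal sketch for locating such values before calling `tgan.fit(data)` follows. The `report_non_finite` helper is hypothetical and not part of TGAN, `continuous_columns` is assumed to hold positional indices, and the sample data is made up for illustration.

```python
import numpy as np
import pandas as pd


def report_non_finite(data, continuous_columns):
    """Return {column position: number of NaN/inf values} for continuous columns.

    Hypothetical helper, not part of TGAN.
    """
    bad = {}
    for i in continuous_columns:
        # Coerce to float so that non-numeric entries are counted as NaN too.
        values = pd.to_numeric(data.iloc[:, i], errors='coerce').to_numpy(dtype=float)
        count = int((~np.isfinite(values)).sum())
        if count:
            bad[i] = count
    return bad


df = pd.DataFrame({'age': [22.0, np.nan, 26.0], 'fare': [7.25, 71.28, 7.92]})
before = report_non_finite(df, continuous_columns=[0, 1])
print(before)  # column 0 ('age') has one missing value

# One possible remedy: drop the affected rows before fitting.
clean = df.dropna(subset=['age'])
after = report_non_finite(clean, continuous_columns=[0, 1])
print(after)   # no non-finite values remain
```

Whether dropping rows or imputing values is the better choice depends on the dataset; the sketch only shows how to find the offending columns.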
