Comments (8)
@adekusar-drl I am interested in picking up this ticket. Here is what I am thinking right now:
- Add a method to `NeuralNetworkClassifier` that has a parameter `categorical_cols`, a list of column indices that are then one-hot encoded. Unless there is a way you can think of to convert categorical data automatically?
- Call this method at the start of `fit`, `predict`, and `score`.
Alternatively, I could implement this function in the dataset_helper script or somewhere else so it can be used in other models or by the user as a preprocessing step.
What do you think?
from qiskit-machine-learning.
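The `categorical_cols` idea proposed above could be sketched roughly as follows. This is only an illustration, not code from qiskit-machine-learning: the helper name `one_hot_encode_columns` is hypothetical, and it assumes the categorical columns are stored as numeric codes and that sklearn's `ColumnTransformer` is an acceptable dependency.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder


def one_hot_encode_columns(X, categorical_cols):
    """Hypothetical helper: one-hot encode the listed column indices of X."""
    transformer = ColumnTransformer(
        transformers=[("onehot", OneHotEncoder(), categorical_cols)],
        remainder="passthrough",  # leave the other columns unchanged
        sparse_threshold=0.0,  # always return a dense array
    )
    # Encoded columns come first, followed by the passthrough columns
    return transformer.fit_transform(X), transformer


# Column 1 holds category codes (0.0 or 2.0); column 0 is a regular feature
X = np.array([[0.1, 2.0], [0.4, 0.0], [0.7, 2.0]])
encoded, transformer = one_hot_encode_columns(X, categorical_cols=[1])
```

Returning the fitted transformer alongside the data would let the same encoding be reused in `predict` and `score`, which is the reason the proposal calls the method in all three places.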
Hi, in general you are right. First of all, please take a look at `OneHotEncoder` from sklearn; I hope we can make use of it instead of implementing the conversion logic from scratch. Next, I'm not sure we need `categorical_cols` right away: we have a basic classifier that does not accept different types of labels, so it should be easy to derive the label type from the true labels.
Anyway, any suggestions are very welcome as some exploration is required here.
from qiskit-machine-learning.
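A minimal sketch of deriving the label type from the true labels and one-hot encoding them with sklearn's `OneHotEncoder`. The helper `encode_labels` is a hypothetical name used for illustration, and the rule "numeric labels pass through, everything else is encoded" is an assumption rather than the behavior of the actual classifier.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder


def encode_labels(y):
    """Hypothetical helper: one-hot encode labels unless they are numeric."""
    y = np.asarray(y)
    if np.issubdtype(y.dtype, np.number):
        return y, None  # numeric labels are passed through untouched
    encoder = OneHotEncoder()
    # OneHotEncoder expects a 2D array, so reshape labels to a single column
    one_hot = encoder.fit_transform(y.reshape(-1, 1)).toarray()
    return one_hot, encoder


y_encoded, encoder = encode_labels(["cat", "dog", "cat", "bird"])
# Map one-hot rows back to the original labels, e.g. for predictions
y_decoded = encoder.inverse_transform(y_encoded).ravel()
```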
Oh I see, I thought you were referring to categorical features, not labels. That should be easy enough to infer and transform.
I will let you know if I have any other questions!
from qiskit-machine-learning.
I don't know why I was thinking of labels only. Categorical features can also be a case, but such a feature would require even more exploration and investigation. There's also `LabelEncoder` in sklearn that may help.
from qiskit-machine-learning.
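For reference, a short sketch of what `LabelEncoder` does: it maps arbitrary labels to integers and back. The example labels are made up for illustration.

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
# Each label is replaced by its index in the sorted list of unique labels
y_int = encoder.fit_transform(["yes", "no", "yes", "maybe"])
# inverse_transform recovers the original labels from the integers
y_back = encoder.inverse_transform(y_int)
```

Unlike `OneHotEncoder`, this yields a single integer per sample, which may suit losses that expect class indices rather than one-hot vectors.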
Yeah, categorical features would be interesting but slightly more complicated. I think we would either need to have the user preprocess the data before instantiating the model, so the number of qubits can be specified, or we would have to use PCA to force the encoding dimensionality to fit the number of qubits. Could be a cool task but might warrant its own issue/PR.
And yes, I have been looking into sklearn's `LabelEncoder` and `OneHotEncoder`. Thanks for the recommendation.
from qiskit-machine-learning.
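The PCA option mentioned above might look something like this. It is a sketch under stated assumptions: `num_qubits` is an assumed feature-map size, and the random matrix merely stands in for one-hot encoded features whose width exceeds that size.

```python
import numpy as np
from sklearn.decomposition import PCA

num_qubits = 2  # assumed number of qubits available for encoding
rng = np.random.default_rng(0)
X_encoded = rng.random((10, 5))  # stand-in for one-hot encoded features

# Project the encoded features down to one component per qubit
pca = PCA(n_components=num_qubits)
X_reduced = pca.fit_transform(X_encoded)
```

The fitted `pca` object would have to be kept and applied with `transform` at prediction time so that new samples go through the same projection.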
At the top level, I think we should keep interfaces as simple as possible and, by default, users should not have to worry about the number of qubits and so on. At the same time, the interfaces should be flexible, so experienced users can tweak the model the way they want. Yeah, feel free to split the work into two or more PRs.
from qiskit-machine-learning.
@adekusar-drl while we wrap this ticket up, I would be interested in exploring how to automatically handle categorical features, if you think that would be valuable. Otherwise I would be happy to find another issue to take a look at.
Thoughts?
from qiskit-machine-learning.
Certainly, you can explore categorical features, but, honestly, I don't know what you may hit on this road.
from qiskit-machine-learning.