Comments (4)
It is a good idea, but no single score would fit all possible types of analysis. For example, for a non-parametric analysis I may not care about outliers, so I won't want them included in my score function. If you decide to implement it, I would suggest allowing the user to pass the score function as an option:
pandas_profiling.ProfileReport(df, score = my_score_function)
with some number of built-in choices.
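A minimal sketch of what such a pluggable score could look like, written as standalone Python rather than as part of the library. Everything here is an assumption for illustration: `quality_score`, `BUILT_IN_SCORES`, `completeness` and `inlier_share` are hypothetical names, and nothing in this thread confirms that `ProfileReport` accepts a `score` argument. Columns are passed as a plain mapping of column name to values (for a DataFrame, `df.to_dict("list")` gives that shape; missing values are assumed to be `None`).

```python
import statistics
from typing import Callable, Mapping, Optional, Sequence, Union

Columns = Mapping[str, Sequence[Optional[float]]]
ScoreFunction = Callable[[Columns], float]


def completeness(columns: Columns) -> float:
    """Hypothetical built-in: fraction of cells that are present (not None)."""
    cells = [v for values in columns.values() for v in values]
    return sum(v is not None for v in cells) / len(cells) if cells else 1.0


def inlier_share(columns: Columns) -> float:
    """Hypothetical built-in: fraction of present values inside the
    1.5 * IQR fences of their own column."""
    inside = total = 0
    for values in columns.values():
        present = [v for v in values if v is not None]
        total += len(present)
        if len(present) < 2:  # statistics.quantiles needs at least two points
            inside += len(present)
            continue
        q1, _, q3 = statistics.quantiles(present, n=4)
        low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
        inside += sum(low <= v <= high for v in present)
    return inside / total if total else 1.0


# Named built-in choices; a caller can also pass any callable instead.
BUILT_IN_SCORES = {"completeness": completeness, "inlier_share": inlier_share}


def quality_score(columns: Columns,
                  score: Union[str, ScoreFunction] = "completeness") -> float:
    """Apply a built-in score (by name) or a user-supplied score function."""
    score_function = BUILT_IN_SCORES[score] if isinstance(score, str) else score
    return score_function(columns)
```

Someone running a non-parametric analysis could then pick `"completeness"` or pass their own function and ignore outliers entirely, which is the flexibility the comment asks for.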
from ydata-profiling.
Rather than open another issue, I'd like to suggest an idea I'm working on. My skills are not so hot, so I'm hoping someone better than I am could pick it up and run with it:
I love Profiling and it is now my go-to for any new dataset.
What I would really love is the ability to automatically compare two or three target variables against all the other variables. For instance, given a file with males and females, we would want to compare the age frequency of each group (as a % of the selected subgroup), and likewise if we broke the males and females down by the state they live in: count, normalize, plot, and so on. Categorical variables are a bit tricky, since we need to count and normalize them. It might as well also compute the covariance and rank the variables by it. This would be hugely valuable for a first look when beginning any machine learning work. Hopefully that makes sense. I'm slowly piddling with it for one specific file, so maybe I can then learn to generalize it. Maybe someone else is way faster and better than I am and wants the same!
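The count-and-normalize step described above can be sketched in plain Python. This is only an illustration under stated assumptions: rows are dictionaries, and the function names `normalized_frequencies` and `compare_target` are hypothetical, not part of the library. Plotting and the covariance ranking are left out.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Mapping

Row = Mapping[str, Hashable]


def normalized_frequencies(rows: List[Row], target: str,
                           column: str) -> Dict[Hashable, Dict[Hashable, float]]:
    """For each value of `target`, return the share of each `column` value
    within that subgroup (shares in one subgroup sum to 1)."""
    counts: Dict[Hashable, Counter] = defaultdict(Counter)
    for row in rows:
        counts[row[target]][row[column]] += 1
    return {
        group: {value: n / sum(counter.values()) for value, n in counter.items()}
        for group, counter in counts.items()
    }


def compare_target(rows: List[Row], target: str):
    """Compare the target against every other column found in the first row."""
    if not rows:
        return {}
    other_columns = [name for name in rows[0] if name != target]
    return {name: normalized_frequencies(rows, target, name)
            for name in other_columns}


rows = [
    {"sex": "M", "state": "NY"},
    {"sex": "M", "state": "CA"},
    {"sex": "F", "state": "NY"},
    {"sex": "F", "state": "NY"},
]
comparison = compare_target(rows, target="sex")
print(comparison["state"]["M"])  # {'NY': 0.5, 'CA': 0.5}
```

With pandas, `pd.crosstab(df["sex"], df["state"], normalize="index")` produces the same row-normalized table for a single pair of columns; a continuous variable such as age would need binning first.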
Doing target profiling is definitely high on the priority list; see #10.
Once that is implemented, it would be easier to add a list of target variables instead of just one.
Stale issue