Comments (2)
By definition, tf-idf applies to documents within a corpus, so you have to decide what "document" and "corpus" mean in your data for tf-idf to be a meaningful, helpful statistic in your analysis. If you are looking at your survey responses and you want a sense of the most important words overall (not differences by person), you may just want to look at word frequencies, i.e. the most common words used in the survey responses.
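To make the distinction concrete, here is a minimal sketch in plain Python (not the tidytext API). It assumes each respondent is treated as a "document" and the whole survey as the "corpus", and it uses the common definition tf × ln(N / df). The responses, the `tokenize` helper, and the `tf_idf` function are all made up for illustration.

```python
import math
from collections import Counter

# Hypothetical survey responses, one per respondent (made-up data).
responses = {
    "r1": "the app is slow and the search is slow",
    "r2": "the search works but the app crashes",
    "r3": "the app is great",
}

def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

# Overall word frequencies: the whole survey treated as one bag of words.
overall = Counter(word for text in responses.values() for word in tokenize(text))

def tf_idf(docs):
    """tf-idf with each entry of `docs` as one document in the corpus."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(counts)
    # Number of documents that contain each word at least once.
    doc_freq = Counter(word for c in counts.values() for word in c)
    scores = {}
    for doc_id, c in counts.items():
        total = sum(c.values())
        for word, n in c.items():
            scores[(doc_id, word)] = (n / total) * math.log(n_docs / doc_freq[word])
    return scores

scores = tf_idf(responses)
```

In this toy data, a word that every respondent uses gets a tf-idf of zero no matter how frequent it is overall, which is why plain counts (usually after removing stop words) are the better fit for "most important words in the survey as a whole".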
If you have survey responses, then the individual response would be the "feature" that you would want to use for pairwise_count() and pairwise_cor(). You wouldn't want to count up or compute the correlation of words that separate survey respondents used. You'll end up with correlations or co-occurrences like this; that analysis was done with exactly this kind of per-survey-respondent counting.
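As a rough illustration of per-respondent counting, the sketch below (plain Python, not widyr's implementation) counts how many respondents used both words of each pair. The `rows` data and the `pairwise_counts` helper are hypothetical; the respondent ID plays the role of the "feature", so words are only paired within a single response.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tidy data: one row per (respondent, word); made-up values.
rows = [
    ("r1", "slow"), ("r1", "search"), ("r1", "app"),
    ("r2", "search"), ("r2", "app"), ("r2", "crashes"),
    ("r3", "app"), ("r3", "great"),
]

def pairwise_counts(rows):
    """Count how many respondents used both words of each unordered pair."""
    words_by_respondent = {}
    for respondent, word in rows:
        # A set, so a word repeated within one response is counted once.
        words_by_respondent.setdefault(respondent, set()).add(word)
    counts = Counter()
    for words in words_by_respondent.values():
        # Pairs are formed only within one respondent, never across them.
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts

counts = pairwise_counts(rows)
```

Because pairs never span two respondents, words such as "great" and "slow" that only appear in different responses get no co-occurrence count at all.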
from tidy-text-mining.
Thank you, that is very helpful!!