
Fitness Communities: Building Online Well-being Experiences

General Introduction

Our project focuses on creating online fitness communities that support and enhance members' well-being. Online communities provide a conducive environment where members can share their knowledge, connect with like-minded individuals, and work together to achieve their fitness goals. Renowned brands such as Strava, Fitbit, and MyFitnessPal serve as successful examples of such online communities.

The objective of our project is to bring together people of all fitness levels within specific communities. These communities will enable members to share knowledge, network with peers, and engage in an environment of learning, challenge, and motivation that enhances their overall well-being. To facilitate this, we harness artificial intelligence (AI) to analyze user data and integrate users into communities tailored to their profiles, goals, experience levels, workout types, coaches, and activities. Various AI techniques, including clustering, similarity analysis, and neural networks, are employed to create groups based on user characteristics such as fitness goals, experience levels, training profiles, and more.
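As a minimal illustration of the similarity-analysis idea, users with numeric profile vectors can be compared with cosine similarity, and highly similar users can be placed in the same community. The feature names and vectors below are hypothetical, not taken from the project code:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two user-feature vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical interest scores per user: [strength, cardio, nutrition].
user_a = np.array([0.9, 0.1, 0.4])
user_b = np.array([0.8, 0.2, 0.5])
user_c = np.array([0.1, 0.9, 0.2])

# Users whose vectors point in similar directions score close to 1
# and are candidates for the same community.
print(cosine_similarity(user_a, user_b))
print(cosine_similarity(user_a, user_c))
```

In practice the same measure can be applied to text vectors (e.g. TF-IDF of comments) rather than hand-built profile scores.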

Data Collection

To initiate our project, the first step involved data collection. Given the absence of a preexisting dataset suitable for our study, we decided to create our dataset using web scraping techniques. However, this approach presented certain challenges, necessitating answers to questions such as:

  • What variables should we extract to build our dataset?
  • What are the relevant topics or subjects to search for?
  • How should we create this dataset (i.e., which websites or platforms to use)?

Regarding the first question, our initial goal was to base the dataset on customer details such as age, gender, country, and a personal description of their well-being interests. However, such personal data is difficult to find on the Internet, and even when available it covers only a small number of people. Since we wanted a larger dataset to obtain more relevant results (we considered generating this data randomly, but found it would simply bias the results, as these fields are supplementary), we decided to collect only comments (the most important variable for understanding a customer's interests and assigning them to relevant communities) along with the authors' names (to group comments written by the same author and avoid bias). Thus, our final dataset consists of two columns: "author" and "comment."
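With this two-column schema, comments by the same author can be merged so one prolific commenter is treated as a single profile rather than several. A minimal sketch with made-up rows (the names and comments are illustrative only):

```python
import pandas as pd

# Hypothetical rows as the scraper would produce them: one (author, comment) pair each.
rows = [
    {"author": "alice", "comment": "Intermittent fasting changed my mornings."},
    {"author": "bob",   "comment": "Running 5k three times a week."},
    {"author": "alice", "comment": "Also drinking more water now."},
]
df = pd.DataFrame(rows, columns=["author", "comment"])

# Join all comments from the same author into one text, so that author
# is represented once instead of being split across several rows.
per_author = df.groupby("author")["comment"].apply(" ".join).reset_index()
print(per_author)
```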

Regarding the topics to search for, we identified three main subjects in the well-being domain: health, fitness, and nutrition. However, as we conducted research, we discovered other interesting sub-topics. Here is the final list of key topics we used: fitness, nutrition, obesity, diabetes, smoking, intermittent fasting, vegetables and fruits, drinking water, sugar challenge, hearing loss, dental health, heart disease, waking up early, lower back pain, eye disease, dry eye, blood pressure, nerves, exercise routines for beginners, yoga, healthy meal plans and recipes, mental health, running and endurance training, strength training and muscle building, sleep hygiene and the importance of rest, and healthy lifestyle.

Now that we have defined our dataset structure and the topics to search for, we need to select the platforms to target for data collection. This step was particularly challenging because we had to find platforms that contained sharing spaces where people could express their views through comments. Furthermore, these comments had to be relevant, as we often encountered irrelevant content such as "ok" or "thanks." Thus, we had to look for platforms where relevant comments were available for the previously mentioned topics. After extensive research, we ultimately chose to work with YouTube. Scraping code
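The actual scraping code is linked above. As an illustrative sketch only: the YouTube Data API v3 `commentThreads` endpoint returns JSON in the shape below, which can be flattened into the (author, comment) schema. The response fragment here is made up, but follows the API's documented structure:

```python
def extract_comments(response: dict) -> list[dict]:
    """Flatten a YouTube Data API commentThreads response into
    {author, comment} rows matching the dataset schema."""
    rows = []
    for item in response.get("items", []):
        snippet = item["snippet"]["topLevelComment"]["snippet"]
        rows.append({
            "author": snippet["authorDisplayName"],
            "comment": snippet["textDisplay"],
        })
    return rows

# Made-up response fragment in the API's documented shape.
fake_response = {
    "items": [
        {"snippet": {"topLevelComment": {"snippet": {
            "authorDisplayName": "fit_fan",
            "textDisplay": "Great tips on lower back pain!",
        }}}},
    ]
}
print(extract_comments(fake_response))
```

In a real run, the response would come from an authenticated `commentThreads().list(part="snippet", videoId=...)` call via the `google-api-python-client` library, paginated with `nextPageToken`.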

Preprocessing

In this chapter, we begin preprocessing our dataset, a crucial step that prepares the raw data for the next stage. To achieve this, we use NLP techniques to optimize the results. Preprocessing notebook
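A minimal sketch of the kind of cleaning such a pipeline performs on raw YouTube comments. The stop-word list here is deliberately tiny and illustrative; a real pipeline would use a full NLP library (e.g. NLTK or spaCy) for stop words, tokenization, and lemmatization:

```python
import re

# Tiny illustrative stop-word set; not the project's actual list.
STOP_WORDS = {"the", "a", "an", "is", "are", "i", "my", "to", "and", "of"}

def preprocess(comment: str) -> list[str]:
    """Lowercase, strip URLs and punctuation, tokenize, drop stop words."""
    text = comment.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters only
    return [tok for tok in text.split() if tok not in STOP_WORDS]

print(preprocess("I LOVE my morning yoga!! https://example.com"))
# → ['love', 'morning', 'yoga']
```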

Creating clustering models

To achieve the best results, we tried several approaches (both for the numeric representation of comments and for model selection) before choosing our final clustering model. Model building code
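One common combination of the two choices above is TF-IDF for the numeric representation and K-Means for the clustering model. The sketch below uses scikit-learn on a made-up four-comment corpus; it illustrates the technique, not the project's final model:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Hypothetical per-author comment texts (the real corpus comes from scraping).
corpus = [
    "strength training and muscle building routine",
    "muscle building with strength exercises",
    "healthy meal plans and nutrition recipes",
    "nutrition tips and healthy recipes",
]

# Numeric representation: TF-IDF vectors over the corpus vocabulary.
vectors = TfidfVectorizer().fit_transform(corpus)

# Clustering: K-Means with a fixed seed for reproducibility.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)
print(model.labels_)  # cluster (community) assignment per comment
```

Other representations (e.g. word embeddings) and models (e.g. hierarchical or density-based clustering) can be swapped in behind the same interface, which is what makes comparing approaches straightforward.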

Technologies

  • Web scraping
  • NLP
  • Clustering

