ARC 2024

Introduction

This project uses the ARC 2024 competition dataset and aims to build an algorithm that can solve novel abstract reasoning tasks it has never seen before. This ability is at the crux of artificial general intelligence (AGI): it is a component of AI systems that can learn new skills and solve open-ended problems, rather than AI systems that simply memorize data.

Unsupervised Learning

This notebook focuses on unsupervised learning. It uses dimensionality reduction and non-linear manifold learning techniques on the dataset, then performs clustering, to prepare the dataset for further analysis. It also trains a neural network on it as a baseline.

Dataset

The ARC dataset can be downloaded from the Kaggle competition page: https://www.kaggle.com/competitions/arc-prize-2024/data. Download it and place it in the appropriate directory within the project before running the notebook.
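As a minimal sketch of what loading the data might look like, the snippet below reads one of the competition JSON files and walks over every grid in it. It assumes the file is a JSON object mapping task ids to tasks with "train" and "test" lists of input/output pairs; the helper names `load_arc_tasks` and `iter_grids` and the path in the usage comment are hypothetical and are not taken from the notebook.

```python
import json
from pathlib import Path


def load_arc_tasks(path):
    """Load an ARC challenges file: a JSON object mapping task id -> task."""
    with Path(path).open("r", encoding="utf-8") as f:
        return json.load(f)


def iter_grids(tasks):
    """Yield (task_id, split, role, grid) for every grid in every task."""
    for task_id, task in tasks.items():
        for split in ("train", "test"):
            for pair in task.get(split, []):
                for role in ("input", "output"):
                    if role in pair:  # test pairs may carry no "output"
                        yield task_id, split, role, pair[role]


# Usage (the path is an assumption; adjust it to where the files were placed):
# tasks = load_arc_tasks("data/arc-agi_training_challenges.json")
# grids = [grid for _, _, _, grid in iter_grids(tasks)]
```

Each grid is a nested list of integers, which is why the preprocessing step in the notebook has to flatten it before any clustering can run.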

Running the Code

  1. Launch Jupyter Notebook:

     jupyter notebook

  2. Open the Notebook: Navigate to the notebook file 'arc-ml-a3-v2.ipynb' in the Jupyter Notebook interface and open it.

  3. Run the Notebook:

Follow these steps within the notebook:

  • Data Loading: Ensure the ARC dataset is loaded correctly.
  • Data Preprocessing: Follow the preprocessing steps to flatten nested lists (arrays) and perform any necessary transformations.
  • Clustering: Apply K-Means and Expectation Maximization (EM) clustering algorithms to the dataset.
  • Dimensionality Reduction: Apply PCA (Principal Component Analysis), ICA (Independent Component Analysis), and RP (Random Projection) to transform the dataset.
  • Non-linear Manifold Learning: Use t-SNE (t-Distributed Stochastic Neighbor Embedding), MDS (Multidimensional Scaling), Spectral Embedding, and UMAP (Uniform Manifold Approximation and Projection) for comparison and visualization.
  • Re-apply Clustering: Run clustering algorithms on the transformed datasets to evaluate the impact of these transformations on clustering performance.
  • Neural Network Models: Train neural network models on the transformed datasets, using clusters as new features, and evaluate the performance using accuracy, learning curves, and training time.
  • Evaluation: Use the Silhouette Score to evaluate clustering performance, considering cohesion within clusters and separation between clusters.
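The preprocessing, clustering, and evaluation steps above can be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: grids are padded to 30x30 with an assumed padding value of -1 and flattened, PCA stands in for the dimensionality reduction step, `GaussianMixture` plays the role of EM clustering, and the function names, cluster count, and component count are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

MAX_SIDE = 30  # ARC grids are at most 30x30 cells
PAD = -1       # assumed padding value, outside the 0-9 colour range


def grids_to_matrix(grids, side=MAX_SIDE, pad=PAD):
    """Pad each 2-D grid to side x side, then flatten it into one row."""
    X = np.full((len(grids), side, side), pad, dtype=float)
    for i, grid in enumerate(grids):
        g = np.asarray(grid, dtype=float)
        X[i, : g.shape[0], : g.shape[1]] = g
    return X.reshape(len(grids), side * side)


def cluster_and_score(X, n_clusters=3, n_components=None, seed=0):
    """Optionally reduce X with PCA, then return Silhouette Scores
    for K-Means labels and for EM (Gaussian mixture) labels."""
    if n_components is not None:
        X = PCA(n_components=n_components, random_state=seed).fit_transform(X)
    km_labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(X)
    em_labels = GaussianMixture(
        n_components=n_clusters, random_state=seed
    ).fit_predict(X)
    return {
        "kmeans": silhouette_score(X, km_labels),
        "em": silhouette_score(X, em_labels),
    }
```

Calling `cluster_and_score(X)` and `cluster_and_score(X, n_components=2)` on the same matrix gives one way to compare clustering quality before and after a transformation. The Silhouette Score ranges from -1 to 1, with higher values indicating tighter, better-separated clusters.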

For more information on manifold learning in scikit-learn, see the relevant section of its documentation: https://scikit-learn.org/stable/modules/manifold.html.
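For orientation, here is a hedged sketch of how three of the manifold methods listed above could be run side by side with scikit-learn to get 2-D embeddings for plotting. The helper name `embed_2d` and all parameter values are assumptions chosen for illustration. UMAP is left out because it is not part of scikit-learn; it is distributed separately in the umap-learn package.

```python
import numpy as np
from sklearn.manifold import MDS, TSNE, SpectralEmbedding


def embed_2d(X, seed=0):
    """Return a dict of 2-D embeddings of X, one per manifold method."""
    n = X.shape[0]
    # t-SNE requires perplexity to be smaller than the number of samples.
    perplexity = min(30.0, max(1.0, (n - 1) / 3))
    methods = {
        "tsne": TSNE(n_components=2, perplexity=perplexity, random_state=seed),
        "mds": MDS(n_components=2, random_state=seed),
        "spectral": SpectralEmbedding(
            n_components=2, n_neighbors=min(10, n - 1), random_state=seed
        ),
    }
    return {name: model.fit_transform(X) for name, model in methods.items()}
```

Each returned array has one row per sample and two columns, so it can be passed directly to a scatter plot or to a clustering step for comparison with the linear reductions.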

Results

Results will be displayed within the Jupyter Notebook, including:

  • Clustering results visualized for both original and transformed datasets.
  • Silhouette Scores for different clustering and dimensionality reduction techniques.
  • Neural network performance metrics.

Future Work

Further research may include:

  • Exploring advanced clustering methods like hierarchical clustering and DBSCAN.
  • Fine-tuning hyperparameters using Bayesian optimization.
  • Incorporating ensemble methods and transformer-based embeddings.
  • Investigating reinforcement learning for abstract reasoning tasks.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributors

park-jsdev