plurigrid / ontology

autopoietic ergodicity and embodied gradualism

Home Page: https://vibes.lol


ontology's Introduction

v0.4: Amelia

  • obtain the arena from github main, ontology/arena.json which will follow the arena.json.schema
  • in a new branch, OPEN A DRAFT PULL REQUEST IMMEDIATELY to add log/yyyy-mm-dd-name-<focus>.md
  • navigate to plurigrid.game OR to localhost (run "just play")
  • you will see the Plurigrid UI
  • add a new tree - the System prompt will already have the necessary primitives to help you along the way
  • as instructed by the textbox, paste the arena.json from the GitHub ontology repo and begin going through the tasks assigned to you (you can ask !filter: Name to use your name only)
  • as you go through the day, push your arena state into your branch (for which you opened a draft PR)
  • iterate: in a separate tree on significant tasks - your goal is to fit your entire gm-gn iteration into the same tree, which means staying within the context window
  • compose: as your day is unfolding, others will be working as well - they will be pasting their arena into their branch under log/yyyy-mm-dd-name-<focus>.md. Ask Plurigrid to reconcile your arenas after pasting them one by one in the same prompt
  • merge the PR with the COMPOSED arena.json ONLY AFTER THE TEAM APPROVES

Ontology

INVITATION: EACH TIME YOU PLAY AND IT MADE YOUR DAY BETTER OR MADE YOU SMILE, LEAVE THIS BETTER THAN YOU HAD FOUND IT, FOR WHEN WE ALL PLAY AGAIN. 🌳

Playbook 😏

Step 1: Begin Your Journey (gm -> gn)

Start your loop at the beginning of the day with the agent -- send a "GM" and one of our preset prompts to iterate on an idea whose results you then compose, and share your goals for your work. This should be done in the open game at https://plurigrid.game to facilitate bidirectional communication.

Step 2: Iterative & Adaptive Development Loops (Journey, Iterate, Compose)

Journey Loop (Your Own Sense): Monitor and fine-tune your progress, adapting as needed. Embrace a human-in-the-loop approach, learning from other team members and the groundbreaking AI founder concept.

Iterate Loops (Our Work Together): Collaborate with teammates to contribute and continuously improve Plurigrid's shared components and resources, as well as learn from others' experiences and findings. Use the Iterate Loops for tasks that are unfinished or ongoing, and seek feedback and support from your peers.

Compose Loops (WAGMI share with your Plurigrid): As you complete tasks and make progress, integrate and combine your work with that of your teammates in the Compose stage. This fosters collective growth and supports the composition of our ontologies. Compose is for tasks that are essentially done, facilitating the integration of everyone's output.

Reflect on your achievements, challenges, and learnings each day. Share your daily summaries to contribute to the collective growth of the Plurigrid team. By embracing iterative development and adapting as needed, together, we grow stronger and more capable.

Step 3: Keep Looping & Collaborate

Leverage the OpenAgency framework for brainstorming in your work, collaborate using platforms that operate on interoperable data formats, and use any tool that works on those data streams and complies with the Digital Public Goods framework. As you progress, contribute your improvements to the Plurigrid ontology on an ongoing basis. The feedback loop at Plurigrid is continuous, and the ones rewarded are those playing next to the top.

Step 4: Close The Loop

At the end of the play/coplay session, review the progress made during the session. Update the action items based on the session results. Plan upcoming tasks to maintain momentum. Integrate any new frameworks or concepts that emerged during the session. Update the Plurigrid ontology with any artifacts or outputs of your work by using the Obsidian Git plugin. By following the Play / Coplay framework, you'll efficiently collaborate and contribute to Plurigrid Inc., fostering a robust, interconnected, and evolving system that can tackle the challenges of decentralization on a multi-planetary scale. Welcome aboard! 🌌

Introduction

Welcome to the Plurigrid Protocol, where our mission is to create a decentralized energy platform by harnessing the power of collective intelligence. In order to achieve this goal, it is vital that all players in the network follow the play-coplay process. This process not only functions as a blueprint for our collaborative efforts but also mirrors the autopoietic nature of the plurigrid protocol itself.

By engaging in this play-coplay process, we navigate the complex realms of generative channels (the creation of outcomes) and recognition channels (understanding the connections between these outcomes) through a combination of learning, adaptation, and feedback. This ensures that our network evolves in a gradual and embodied manner––all while maintaining a harmonious balance between individual and collective progress.

Drawing inspiration from the principles of autopoietic ergodicity, Markov categories, and game theory, the play-coplay process enables us to effectively coordinate our efforts and make strides towards a more efficient and resilient decentralized energy platform.

Embarking on this journey together, we are not only building a strong foundation for the Plurigrid Protocol but also embodying its core principles in the way we collaborate, learn, and grow. Join us in embracing the play-coplay process, and let's revolutionize the world of decentralized energy systems—one interaction at a time.

Welcome to Plurigrid Protocol! 🚀

This README will guide you through the process of setting up your development environment and getting started with contributing to the Plurigrid Protocol. Our goal is to ensure a smooth onboarding experience, so you can quickly become an active member of our project.

Getting Started

Before diving into Plurigrid Protocol, we recommend familiarizing yourself with the background theory and checking out our playbook to have a better understanding of the project's principles and guidelines.

Installation

CLI

Set up your development environment and start contributing.

curl -L https://nixos.org/nix/install | sh
nix-env -iA nixpkgs.just && just play

To debug the open game:

nix-shell
poetry shell
python scripts/http.py

If you want to exit the shell environment and return to your original shell, you can simply type exit or press Ctrl-D.

Obsidian

To prepare your pluralistic interface in Obsidian, simply use the appropriate template under .playback.

Then run just obsidian.

See the obsidian installation instructions below for more details.

Setting Up Obsidian and Plugins

  1. Set up Obsidian
  2. Set up the Smart Connections Plugin
  • Go to Settings (gear icon) > "Community plugins" > "Turn off Restricted Mode" Then, "Browse community plugins".
  • Search for "Smart Connections" and install the plugin.
  • Activate the Smart Connections plugin in the "Installed plugins" tab.
  • Configure the plugin to your preferences and start creating AI-powered note connections by following the plugin's documentation: https://github.com/mgmeyers/obsidian-smartconnections
  • Make sure you set your OpenAI API key for GPT-4 in the plugin settings.
  3. Set Up Obsidian Git Plugin
  • Install the Obsidian Git plugin following the steps for the previous section, but search for the "Obsidian Git" plugin.
  • Plugin documentation can be found here: https://publish.obsidian.md/git-doc/01+Start+here
  • Follow the installation and authentication steps from the documentation.
  • Configure Obsidian Git settings: Go to Settings > Plugins Options > Obsidian Git to configure your settings. For example settings, see .config/data.json in this repository and copy it over to .obsidian/plugins/obsidian-git/data.json.

Workflow with Obsidian Git and Smart Connections

  • Before you begin, run the "Obsidian Git: Pull" command to make sure your workspace is up to date.
  • When you start your journey, create a new branch that is topical to the journey. Use the "Obsidian Git: Create new branch" command.
  • With the default settings, your work should automatically be pushed and merged every 60 minutes. Your repository will also be automatically pulled on the same cadence.
  • As you contribute to ontology, you can use the "Smart Connections: Smart Chat conversation" command to ask an agent questions over your data. Use the ontology agent as a guide to help your personal loops.

Background Theory

Autopoietic Ergodicity: A Foundation for Embodied Gradualism

Autopoietic ergodicity encompasses the ability of Plurigrid systems to self-organize, adapt, and evolve in diverse and ever-changing environments. This concept is grounded in two core principles: autopoiesis, referring to the self-maintenance and self-regulation of a system, and ergodicity, which deals with the equivalence between time and ensemble averages in an interoperable system.

In Plurigrid development, autopoietic ergodicity is crucial in capturing and understanding systems' dynamic interactions with their environments, allowing them to achieve long-term stability and maintain relevance as situations change over time. By ensuring that the learning and adaptation processes of Plurigrid systems align with the principles of autopoietic ergodicity, developers can create systems that continuously evolve and foster an environment in which many worlds can not only co-exist but thrive together.
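
As an illustrative sketch of the ergodicity half of this principle (this is not repository code; the two-state chain and observable are assumptions for the example): for an ergodic Markov chain, the time average of an observable along one long trajectory approaches its ensemble average under the stationary distribution.

import numpy as np

# Hypothetical two-state Markov chain used only to illustrate ergodicity.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # row-stochastic transition matrix
f = np.array([0.0, 1.0])            # observable: indicator of being in state 1

rng = np.random.default_rng(0)
state, samples = 0, []
for _ in range(100_000):
    state = rng.choice(2, p=P[state])
    samples.append(f[state])
time_average = np.mean(samples)

# Stationary distribution: the left eigenvector of P with eigenvalue 1.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi /= pi.sum()
ensemble_average = pi @ f

print(time_average, ensemble_average)  # both approach 1/3 for this chain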

Open Games and Markov Category

Play / Generative Channel

A generative channel, also known as a generative model or stochastic channel, is a mathematical construct that models the process of generating data or outcomes according to some specified underlying probability distribution. It captures the dependencies and relationships between variables, such as input features and output labels in a data-driven system or between players' strategies and outcomes in a game theory setting.

In the context of a Markov category, a generative channel can be represented as a morphism between objects, where objects capture the structure of probability spaces, and morphisms represent stochastic processes or conditional probability distributions. The composition of morphisms in a Markov category embodies the concept of sequential stochastic processes, where the output of one channel serves as the input for the next.

Generative channels are used to model a wide range of systems in various domains, including machine learning, statistics, and game theory. By analyzing the properties of these channels, one can draw inferences about the underlying processes, predict future outcomes, or optimize the design of a system. In the context of game theory, generative channels can be used to model the dependencies between player strategies, game states, and payoffs, allowing for a deeper understanding of the dynamics of strategic interactions in a game.
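
To make the compositional reading concrete, here is a minimal illustrative sketch (not repository code; the strategy/state/payoff names and numbers are assumptions for the example): finite stochastic channels can be represented as row-stochastic matrices, and composing two channels in sequence is matrix multiplication of their kernels.

import numpy as np

# Illustrative finite Markov kernels: each row is a conditional
# distribution P(output | input), so every row sums to 1.
strategy_to_state = np.array([[0.7, 0.3],     # P(game state | player strategy)
                              [0.1, 0.9]])
state_to_payoff = np.array([[0.8, 0.2],       # P(payoff | game state)
                            [0.4, 0.6]])

# Composition of morphisms corresponds to chaining the kernels:
# P(payoff | strategy) = sum over states of P(payoff | state) P(state | strategy).
strategy_to_payoff = strategy_to_state @ state_to_payoff

prior_over_strategies = np.array([0.5, 0.5])
payoff_distribution = prior_over_strategies @ strategy_to_payoff
print(strategy_to_payoff)
print(payoff_distribution)          # a valid probability distribution over payoffs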

Co-Play / Recognition Channel

A recognition channel, also referred to as a recognition model or inference model, is a mathematical construct used to model the process of inferring or estimating the underlying latent variables or parameters from observed data or outcomes. It captures the probabilistic relationship between the observed variables and the latent variables and serves as the inverse of a generative channel or generative model.

In the context of a Markov category, a recognition channel can be represented as a morphism between objects, where objects capture the structure of probability spaces, and morphisms represent stochastic processes or conditional probability distributions. The composition of morphisms in a Markov category embodies the concept of sequential stochastic processes, where the output of one channel serves as the input for the next.

Recognition channels play a significant role in various fields, including machine learning, statistics, and game theory. In machine learning, recognition channels are often used for variational inference and learning, where the goal is to approximate an intractable posterior distribution of latent variables given observations. In game theory, recognition channels can be employed to model the players' beliefs about other players' strategies based on observed actions, which can be useful in understanding and predicting the behavior of players in strategic interactions.

Learning

Together with generative channels, recognition channels form an essential part of the learning and inference process. They enable a systematic translation and understanding of the relationships between observable data and hidden variables or parameters that govern the underlying processes in a system.
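
A minimal sketch of this pairing under the same finite, discrete assumptions (illustrative only, not repository code): given a prior over latent variables and a generative channel, the recognition channel can be obtained by Bayesian inversion, mapping each observation to a posterior over latents.

import numpy as np

# Generative channel: P(observation | latent), rows sum to 1.
generative = np.array([[0.9, 0.1],
                       [0.2, 0.8]])
prior = np.array([0.6, 0.4])        # P(latent)

# Recognition channel as the Bayesian inverse of the generative channel:
# P(latent | observation) = P(obs | latent) P(latent) / P(obs).
joint = prior[:, None] * generative           # P(latent, observation)
evidence = joint.sum(axis=0)                  # P(observation)
recognition = (joint / evidence).T            # rows indexed by observation

print(recognition)          # each row is a posterior over latents and sums to 1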

ontology's People

Contributors

amacdawg, blue-note, bmorphism, cornbread5, enfascination, github-merge-queue[bot], gsspdev, irlydontcodelmao, jonnyjohnson1, lucas-chu, robama38, ryan-chase


ontology's Issues

implement grid_coalition following the example of ai_society, but within plurigrid context

Including this in ontology so as to capture it as the primary artifact, to be delivered in the context of several nexus devices engaging in agentic amplification of their respective owner(s).

https://github.com/lightaime/camel/blob/master/examples/ai_society/role_playing_multiprocess.py

Microworlds are to be understood as simply instances of https://github.com/chatarena/chatarena#creating-your-custom-environment with grounding in electricity grid physics.
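
As a sketch of what such a microworld could look like, the snippet below is only an illustrative stand-in using the gym.Env interface that other code in this repository already uses, not the chatarena environment API; the ToyGridMicroworld name and its toy supply/demand balance are assumptions for the example.

import numpy as np
import gym

class ToyGridMicroworld(gym.Env):
    # Hypothetical stand-in for a grid-grounded microworld (NOT the chatarena API).
    def __init__(self, n_agents = 3, max_timesteps = 24):
        super().__init__()
        self.n_agents = n_agents
        self.max_timesteps = max_timesteps
        self.action_space = gym.spaces.Box(low = 0.0, high = 1.0, shape = (n_agents,))
        self.observation_space = gym.spaces.Box(low = 0.0, high = np.inf, shape = (2,))
        self.timestep = 0

    def reset(self):
        self.timestep = 0
        self.demand = np.random.uniform(0.5, 1.5)
        return np.array([self.demand, 0.0])

    def step(self, actions):
        supply = float(np.sum(actions))            # each agent contributes some generation
        imbalance = abs(supply - self.demand)      # grid grounding: supply should track demand
        reward = -imbalance
        self.timestep += 1
        self.demand = np.random.uniform(0.5, 1.5)
        done = self.timestep >= self.max_timesteps
        return np.array([self.demand, supply]), reward, done, {}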

Prioritized Replay - Gromov Wasserstein Minibatch Sampling + Update Function

import torch 
import torch.nn as nn 
import torch.nn.functional as F 
import numpy as np

import scipy
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

import os 
import gym 
import random 
import queue
import collections 

class PrioritizedExperienceReplay:
    def __init__(self, buffer_size, alpha, beta, epsilon):
        self.buffer_size = buffer_size 
        self.alpha = alpha 
        self.beta = beta 
        self.epsilon = epsilon 

        self.buffer = collections.deque(maxlen = self.buffer_size)
        self.priorities = collections.deque(maxlen = self.buffer_size)
        self.priorities_sum = 0.0 
        self.max_priority = 1.0 
    
    def add_experience(self, experience, priority):
        self.buffer.append(experience)
        self.priorities.append(priority)
        self.priorities_sum += priority 
        
        if priority > self.max_priority:
            self.max_priority = priority 
    
    def minibatch(self, batch_size):
        priorities = np.array(self.priorities)
        # Normalize the exponentiated priorities so they form a valid distribution.
        probabilities = priorities ** self.alpha
        probabilities /= probabilities.sum()
        indices = np.random.choice(len(self.buffer), size = batch_size, p = probabilities)
        weights = (len(self.buffer) * probabilities[indices]) ** (-self.beta)
        weights /= np.max(weights)
        batch = [self.buffer[i] for i in indices]
        return batch, indices, weights 
    
    def update_priorities(self, indices, errors):
        for i, error in zip(indices, errors):
            priority = (error + self.epsilon) ** self.alpha 
            self.priorities_sum -= self.priorities[i]
            self.priorities[i] = priority 
            self.priorities_sum += priority 
            self.max_priority = max(self.max_priority, priority)

    def compute_cost_matrix(self, C, p, q):
        # Weight the pairwise cost matrix by the marginal distributions p and q.
        p_matrix = np.tile(p, (len(q), 1)).T
        q_matrix = np.tile(q, (len(p), 1))
        M = C * p_matrix * q_matrix
        return M
    
    def compute_GW_distance(self, X, Y, p, q):
        C = cdist(X, Y)
        M = self.compute_cost_matrix(C, p, q)
        row_ind, col_ind = linear_sum_assignment(M)
        GW_distance = M[row_ind, col_ind].sum() / p.sum()
        return GW_distance

    def sample_idx(self, batch_size):
        probabilities = np.array(self.priorities) ** self.alpha 
        probabilities /= np.sum(probabilities)
        idx = np.random.choice(len(self.buffer), size = batch_size, replace = False, p = probabilities)
        return idx 

    def minibatch_GW(self, batch_size, X, p):
        idx = self.sample_idx(batch_size)
        batch = [self.buffer[i] for i in idx]
        # Stack the sampled states into a 2D array so cdist can compare them to X.
        Y = np.array([experience.state for experience in batch])
        q = np.ones(len(Y)) / len(Y)
        GW_dist = self.compute_GW_distance(X, Y, p, q)
        return batch, idx, GW_dist 

    def update_priorities_GW(self, indx, gw_dists):
        for i, gw_dist in zip(indx, gw_dists):
            priority = (gw_dist + self.epsilon) ** self.alpha 
            self.priorities_sum -= self.priorities[i]
            self.priorities[i] = priority 
            self.priorities_sum += priority 
            self.max_priority = max(self.max_priority, priority)

    def get_max_priority(self):
        return self.max_priority 

    def __len__(self):
        return len(self.buffer)
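
A minimal usage sketch for the buffer above (the Transition namedtuple, the state dimension, and the random TD errors are illustrative assumptions, not part of the issue):

import collections
import numpy as np

# Hypothetical transition container; only the .state attribute is used by minibatch_GW.
Transition = collections.namedtuple("Transition", ["state", "action", "reward", "next_state"])

replay = PrioritizedExperienceReplay(buffer_size = 10000, alpha = 0.6, beta = 0.4, epsilon = 1e-5)
for _ in range(256):
    t = Transition(state = np.random.rand(4), action = 0, reward = 0.0, next_state = np.random.rand(4))
    replay.add_experience(t, priority = replay.get_max_priority())

batch, indices, weights = replay.minibatch(batch_size = 32)
td_errors = np.random.rand(32)                     # stand-in for TD errors from a learner
replay.update_priorities(indices, td_errors)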

Integrated CityLearn Reward Function with Temporal Dependency

import torch 
import torch.nn as nn 
import torch.nn.functional as F 
import torch.distributions as distributions

import matplotlib.pyplot as plt 
import numpy as np 

import gym
from citylearn import GridLearn
print(gym.__version__)

from entropyUtilities import *

class customEnvCityLearn(gym.Env):
    def __init__(self, max_timesteps, n_agents, weather_file, building_attributes):
        super().__init__()

        self.timestep = 0
        self.max_timesteps = max_timesteps 
        self.n_agents = n_agents 

        self.past = np.random.randint(0, 2, size = 1)
        self.present = np.random.randint(0, 2, size = 1)
        self.future = np.random.randint(0, 2, size = 1)

        self.tou_periods = [(0, 8), (8, 16), (16, 24)]
        self.tou_prices = [0.1, 0.2, 0.3]
        self.dr_event_start = 500
        self.dr_event_end = 550
        self.dr_event_percent_reduction = 0.1
        
        self.grid = GridLearn(weather_file, building_attributes_file = building_attributes)

    def step(self, actions):
        # Shift the temporal window: past absorbs the present, present absorbs the future.
        self.past = np.append(self.past, self.present, axis = 0)
        self.present = np.append(self.present, self.future, axis = 0)
        self.future = np.append(self.future, actions, axis = 0)
        self.timestep += 1

        tou_period = self.timestep % 24 // 8
        tou_price = self.tou_prices[tou_period]

        if self.dr_event_start <= self.timestep < self.dr_event_end:
            demand_reduction = self.dr_event_percent_reduction
        else:
            demand_reduction = 0
        
        actions_scaled = actions * self.grid.buildings['Electricity'].peak_power / 2
        self.grid.step(actions_scaled, tou_energy_prices = [tou_price] * 3,
                       demand_response = demand_reduction)
        
        tau = self.present 
        s = self.past 
        t = self.future
        I_tau_sx = information_mutual_conditional(s, t, tau)
        I_tau_sx_shared = information_mutual(s, t)
        I_tau_sx_excel = I_tau_sx - I_tau_sx_shared 

        action_counts = np.bincount(actions, minlength = 2)
        action_probabilities = action_counts / len(actions)
        diversity_penalty = -np.sum(action_probabilities * np.log(action_probabilities))
        reward = -I_tau_sx_excel + diversity_penalty

        done = (self.timestep >= self.max_timesteps)
        obs = self.grid.get_state()[0]['Electricity']['consumption'].flatten()
        return obs, reward, done, {}

    def reset(self):
        self.timestep = 0
        self.past = np.random.randint(0, 2, size = 1)
        self.present = np.random.randint(0, 2, size = 1)
        self.future = np.random.randint(0, 2, size = 1)
        self.grid.reset()
        # Return the initial observation, as required by the gym.Env interface.
        obs = self.grid.get_state()[0]['Electricity']['consumption'].flatten()
        return obs

def main(max_timesteps, n_agents, weather_file, building_attributes_file):
    env = customEnvCityLearn(max_timesteps = max_timesteps, n_agents = n_agents,
            weather_file = weather_file,
            building_attributes = building_attributes_file)

    obs = env.reset()
    done = False 
    cumulative_reward = 0

    while not done:
        action = np.random.randint(0, 2, size = 1)
        obs, reward, done, info = env.step(action)
        cumulative_reward += reward 
    return cumulative_reward

Entropy Module File providing implementations for "information-theoretic measure of temporal dependency" (Varley, 2023) -> this complements the Custom "information-theoretic measure of temporal dependency" reward function

import numpy as np 

# Calculate entropy of random variable x
def entropy(x):
    try:
        assert len(x) > 0
    except AssertionError:
        raise ValueError("Input array must not be empty.")
    _, counts = np.unique(x, return_counts = True)
    p = counts / len(x)
    return -np.sum(p * np.log2(p))

# Calculate conditional entropy of x given y
def conditional_entropy(x, y):
    try:
        assert len(x) == len(y)
        assert len(x) > 0 
    except AssertionError:
        raise ValueError("Input arrays must have the same length, and must not be empty!")
    n = len(x)
    _, counts = np.unique(y, return_counts = True)
    hy = np.sum(-(counts / n) * np.log2(counts / n))
    hy_x = 0 
    for y_val in np.unique(y):
        x_given_y = x[y == y_val]
        hy_x += (np.sum(y == y_val) / n) * entropy(x_given_y)
    return hy_x, hy

# Calculate mutual information between x and y given condition z
def information_mutual_conditional(x, y, z):
    try:
        assert len(x) == len(y) == len(z)
        assert len(x) > 0
    except AssertionError:
        raise ValueError("Input arrays must have the same length, and must not be empty!")
    # I(X; Y | Z) = H(X | Z) + H(Y | Z) - H(X, Y | Z)
    hx_z, _ = conditional_entropy(x, z)
    hy_z, _ = conditional_entropy(y, z)
    # Treat (x, y) as a single joint variable to obtain H(X, Y | Z).
    xy = np.array([f"{a},{b}" for a, b in zip(x, y)])
    hxy_z, _ = conditional_entropy(xy, z)
    mi = hx_z + hy_z - hxy_z
    return mi 

# Calculate mutual information between x and y
def information_mutual(x, y):
    try:
        assert len(x) == len(y)
        assert len(x) > 0
    except AssertionError:
        raise ValueError("Input arrays must have the same length and must not be empty!")
    # I(X; Y) = H(X) + H(Y) - H(X, Y), with H(X, Y) computed over the paired samples.
    h_x = entropy(x)
    h_y = entropy(y)
    xy = np.array([f"{a},{b}" for a, b in zip(x, y)])
    h_xy = entropy(xy)
    mi_xy = h_x + h_y - h_xy
    return mi_xy
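
A tiny usage sketch of these helpers on toy binary sequences (the example arrays are illustrative assumptions, not part of the issue):

import numpy as np

past    = np.array([0, 1, 1, 0, 1, 0, 0, 1])
present = np.array([1, 1, 0, 0, 1, 1, 0, 0])
future  = np.array([1, 0, 1, 0, 0, 1, 1, 0])

print(entropy(past))                                          # 1.0 bit for a balanced binary sequence
print(information_mutual(past, future))                       # information shared between past and future
print(information_mutual_conditional(past, future, present))  # the same, conditioned on the present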

I accidentally tried to install twice, it broke and I literally can't

---- warning! ------------------------------------------------------------------
Nix already appears to be installed. This installer may run into issues.
If an error occurs, try manually uninstalling, then rerunning this script.

Uninstalling nix:

  1. Remove macOS-specific components:

    • Uninstall LaunchDaemon org.nixos.nix-daemon

      sudo launchctl bootout system/org.nixos.nix-daemon
      sudo rm /Library/LaunchDaemons/org.nixos.nix-daemon.plist

  2. Restore /etc/bashrc.backup-before-nix back to /etc/bashrc

sudo mv /etc/bashrc.backup-before-nix /etc/bashrc

(after this one, you may need to re-open any terminals that were
opened while it existed.)

  3. Restore /etc/zshrc.backup-before-nix back to /etc/zshrc

sudo mv /etc/zshrc.backup-before-nix /etc/zshrc

(after this one, you may need to re-open any terminals that were
opened while it existed.)

  4. Delete the files Nix added to your system:

sudo rm -rf "/etc/nix" "/nix" "/var/root/.nix-profile" "/var/root/.nix-defexpr" "/var/root/.nix-channels" "/var/root/.local/state/nix" "/var/root/.cache/nix" "/Users/dany/.nix-profile" "/Users/dany/.nix-defexpr" "/Users/dany/.nix-channels" "/Users/dany/.local/state/nix" "/Users/dany/.cache/nix"

and that is it.

---- oh no! --------------------------------------------------------------------
I back up shell profile/rc scripts before I add Nix to them.
I need to back up /etc/bashrc to /etc/bashrc.backup-before-nix,
but the latter already exists.

Here's how to clean up the old backup file:

  1. Back up (copy) /etc/bashrc and /etc/bashrc.backup-before-nix
    to another location, just in case.

  2. Ensure /etc/bashrc.backup-before-nix does not have anything
    Nix-related in it. If it does, something is probably quite
    wrong. Please open an issue or get in touch immediately.

  3. Once you confirm /etc/bashrc is backed up and
    /etc/bashrc.backup-before-nix doesn't mention Nix, run:
    mv /etc/bashrc.backup-before-nix /etc/bashrc

We'd love to help if you need it.

You can open an issue at
https://github.com/NixOS/nix/issues/new?labels=installer&template=installer.md

Or get in touch with the community: https://nixos.org/community

Updated - Debugged CityLearn Gromov Wasserstein Reward Function Env

import torch 
import torch.nn as nn
import torch.nn.functional as F 
import torch.optim as optim

import matplotlib.pyplot as plt 
import numpy as np

import gym 
from citylearn import GridLearn

from entropyUtilities import information_mutual_conditional, information_mutual
from geomloss import SamplesLoss

class customEnv(gym.Env):
    def __init__(self, max_timesteps, n_agents, weather_file, building_attributes_file):
        super().__init__()

        self.timestep = 0
        self.max_timesteps = max_timesteps
        self.n_agents = n_agents

        self.past = np.random.randint(0, 2, size=1)
        self.present = np.random.randint(0, 2, size=1)
        self.future = np.random.randint(0, 2, size=1)

        self.tou_periods = [(0, 8), (8, 16), (16, 24)]
        self.tou_prices = [0.1, 0.2, 0.3]
        self.dr_event_start = 500
        self.dr_event_end = 550
        self.dr_event_percent_reduction = 0.1

        self.grid = GridLearn(weather_file, building_attributes_file)

    def step(self, actions):
        self.past = np.append(self.past, self.present, axis=0)
        self.present = np.append(self.present, self.future, axis=0)
        self.future = np.append(self.future, actions, axis=0)
        # Advance time so the episode can terminate once max_timesteps is reached.
        self.timestep += 1

        tou_period = self.timestep % 24 // 8
        tou_price = self.tou_prices[tou_period]

        if self.dr_event_start <= self.timestep < self.dr_event_end:
            demand_reduction = self.dr_event_percent_reduction
        else:
            demand_reduction = 0

        actions_scaled = actions * self.grid.buildings['Electricity'].peak_power / 2
        self.grid.step(actions_scaled, tou_energy_prices=[tou_price] * 3, demand_response=demand_reduction)

        tau = self.present 
        s = self.past 
        t = self.future 
        I_tau_sx = information_mutual_conditional(s, t, tau)
        I_tau_sx_shared = information_mutual(s, t)
        I_tau_sx_excel = I_tau_sx - I_tau_sx_shared

        action_counts = np.bincount(actions, minlength=2)
        action_probabilities = action_counts / len(actions)
        diversity_penalty = -np.sum(action_probabilities * np.log(action_probabilities))
        reward = -I_tau_sx_excel + diversity_penalty

        done = (self.timestep >= self.max_timesteps)
        obs = self.grid.get_state()[0]['Electricity']['consumption'].flatten()
        return obs, reward, done, {}

    def reset(self):
        self.timestep = 0
        self.past = np.random.randint(0, 2, size=1)
        self.present = np.random.randint(0, 2, size=1)
        self.future = np.random.randint(0, 2, size=1)
        self.grid.reset()

        obs = self.grid.get_state()[0]['Electricity']['consumption'].flatten()
        return obs

def main(max_timesteps, n_agents, weather_file, building_attributes_file):
    env = customEnv(max_timesteps=max_timesteps, n_agents=n_agents,
                    weather_file=weather_file,
                    building_attributes_file=building_attributes_file)

    obs = env.reset()
    done = False
    cumulative_reward = 0

    while not done:
        action = np.random.randint(0, 2, size=n_agents)
        obs, reward, done, info = env.step(action)
        cumulative_reward += reward

    print(f"Total reward earned: {cumulative_reward}")

    return cumulative_reward

I updated the import statement in the customEnv class to import only the required functions from the entropyUtilities module. The line changed from from entropyUtilities import * to from entropyUtilities import information_mutual_conditional, information_mutual. This change assumes that the information_mutual_conditional and information_mutual functions are defined in the entropyUtilities module.

Next, I corrected a typo in the argument name during the instantiation of the customEnv class. The line changed from building_attributes=building_attributes_file to building_attributes_file=building_attributes_file. This ensures that the correct argument name building_attributes_file is used.

Lastly, I modified the stepv2 method in the customEnv class to have the correct method name step. This change avoids having two methods with the same name (step and stepv2), which can cause issues. The line changed from def stepv2(self, actions): to def step(self, actions):.

After making these changes, the code should be ready for debugging and running.

Custom Reward Function using "information-theoretic measure of temporal dependency" (Varley, 2023)

import torch 
import torch.nn as nn 
import torch.nn.functional as F 
import torch.distributions as distributions

import matplotlib.pyplot as plt 
import numpy as np 

import gym
print(gym.__version__)

# Calculate entropy of random variable x
def entropy(x):
    try:
        assert len(x) > 0
    except AssertionError:
        raise ValueError("Input array must not be empty.")
    _, counts = np.unique(x, return_counts = True)
    p = counts / len(x)
    return -np.sum(p * np.log2(p))

# Calculate conditional entropy of x given y
def conditional_entropy(x, y):
    try:
        assert len(x) == len(y)
        assert len(x) > 0 
    except AssertionError:
        raise ValueError("Input arrays must have the same length, and must not be empty!")
    n = len(x)
    _, counts = np.unique(y, return_counts = True)
    hy = np.sum(-(counts / n) * np.log2(counts / n))
    hy_x = 0 
    for y_val in np.unique(y):
        x_given_y = x[y == y_val]
        hy_x += (np.sum(y == y_val) / n) * entropy(x_given_y)
    return hy_x, hy

# Calculate mutual information between x and y given condition z
def information_mutual_conditional(x, y, z):
    try:
        assert len(x) == len(y) == len(z)
        assert len(x) > 0
    except AssertionError:
        raise ValueError("Input arrays must have the same length, and must not be empty!")
    # I(X; Y | Z) = H(X | Z) + H(Y | Z) - H(X, Y | Z)
    hx_z, _ = conditional_entropy(x, z)
    hy_z, _ = conditional_entropy(y, z)
    # Treat (x, y) as a single joint variable to obtain H(X, Y | Z).
    xy = np.array([f"{a},{b}" for a, b in zip(x, y)])
    hxy_z, _ = conditional_entropy(xy, z)
    mi = hx_z + hy_z - hxy_z
    return mi 

# Calculate mutual information between x and y
def information_mutual(x, y):
    try:
        assert len(x) == len(y)
        assert len(x) > 0
    except AssertionError:
        raise ValueError("Input arrays must have the same length and must not be empty!")
    # I(X; Y) = H(X) + H(Y) - H(X, Y), with H(X, Y) computed over the paired samples.
    h_x = entropy(x)
    h_y = entropy(y)
    xy = np.array([f"{a},{b}" for a, b in zip(x, y)])
    h_xy = entropy(xy)
    mi_xy = h_x + h_y - h_xy
    return mi_xy

class CustomEnv(gym.Env):
    def __init__(self, max_timesteps, n_agents):
        # Define environment params
        self.action_space = gym.spaces.Discrete(2) # binary action space
        self.observation_space = gym.spaces.Box(low = 0, high = 1, 
                                                shape = (3, ), dtype = np.int32)
        self.timestep = 0
        self.max_timesteps = max_timesteps
        self.n_agents = n_agents

        # Initialize the system with random past, present, and future states
        self.past = np.random.randint(0, 2, size = 1)
        self.present = np.random.randint(0, 2, size = 1)
        self.future = np.random.randint(0, 2, size = 1)

    def step(self, actions):
        # update system based on the chosen actions
        self.past = np.append(self.past, self.present, axis = 0)
        self.present = np.append(self.present, self.future, axis = 0)
        self.future = np.append(self.future, actions, axis = 0)
        self.timestep += 1

        # Calculate the Istx measure
        tau = self.present 
        s = self.past 
        x = self.future 
        I_tau_sx = information_mutual_conditional(s, x, tau)
        I_tau_sx_shared = information_mutual(s, x)
        I_tau_sx_excl = I_tau_sx - I_tau_sx_shared

        # Calculate proposed diversity penalty term
        action_counts = np.bincount(actions, minlength = 2)
        action_probabilities = action_counts / len(actions)
        diversity_penalty = -np.sum(action_probabilities * np.log(action_probabilities))

        # Define reward function
        reward = -I_tau_sx_excl + diversity_penalty

        # Determine if the episode is done or not
        done = (self.timestep >= self.max_timesteps)
        # return the new state, reward, and done status
        return (self.past[-self.n_agents:], self.present[-self.n_agents:], 
                self.future[-self.n_agents:]), reward, done, {}

    def reset(self):
        # Reset the environment to a new initial state
        self.past  = np.random.randint(0, 2, size = self.n_agents)
        self.present = np.random.randint(0, 2, size = self.n_agents)
        self.future = np.random.randint(0, 2, size = self.n_agents)
        self.timestep = 0
        return (self.past, self.present, self.future)

def main():
    env = CustomEnv(n_agents = 10, max_timesteps = 1000)
    
    INPUT_DIM = env.observation_space.shape[0]
    HIDDEN_DIM = 256
    OUTPUT_DIM = env.action_space.n

    print(INPUT_DIM, HIDDEN_DIM, OUTPUT_DIM)

main()

Streamline onboarding

  • Go through the onboarding instructions (obsidian, justfile, etc) in the README and create a PR addressing any issues or hangups encountered
  • Write a raycast script that will complete all onboarding steps for someone when they type "gm"
