This is an ongoing project, part of my PhD research, in which I implement Reinforcement Learning agents that select low-level heuristics for combinatorial optimization problems. The agents are:
- Deep Q-Network
- Dynamic Multi-Armed Bandit
- Fitness-Rate-Rank Multi-Armed Bandit
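
To illustrate how a bandit-style agent can pick among low-level heuristics, here is a minimal sketch using plain UCB1. It is not the Dynamic or Fitness-Rate-Rank variant listed above, and it is not this project's code: the class name `UCB1Selector`, the exploration constant `c`, and the reward values in the usage loop are all hypothetical. In practice the reward would presumably be derived from the change in solution quality after a heuristic is applied.

```python
import math


class UCB1Selector:
    """Plain UCB1 bandit over heuristic indices (illustrative sketch only)."""

    def __init__(self, n_heuristics, c=math.sqrt(2)):
        self.c = c                            # exploration weight (assumed default)
        self.counts = [0] * n_heuristics      # times each heuristic was applied
        self.means = [0.0] * n_heuristics     # average reward per heuristic

    def select(self):
        # Apply every heuristic once before trusting the estimates.
        for h, n in enumerate(self.counts):
            if n == 0:
                return h
        total = sum(self.counts)
        # Score = average reward + exploration bonus that shrinks with use.
        scores = [
            m + self.c * math.sqrt(math.log(total) / n)
            for m, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, h, reward):
        # Incremental mean of the rewards observed for heuristic h.
        self.counts[h] += 1
        self.means[h] += (reward - self.means[h]) / self.counts[h]


# Usage with made-up, fixed rewards standing in for real heuristic feedback.
toy_rewards = [0.1, 0.9, 0.2]
selector = UCB1Selector(n_heuristics=len(toy_rewards))
for _ in range(200):
    h = selector.select()
    selector.update(h, toy_rewards[h])
```

After the loop, `selector.counts` shows how often each heuristic index was chosen; the selector should concentrate on the index with the highest average reward while still occasionally revisiting the others.
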
The domains come from the HyFlex Framework:
- Bin Packing
- MAX-SAT
- Personnel Scheduling
- Flow Shop
- Traveling Salesman
- Vehicle Routing