When scaling Fixit to large monolithic repos we have run into challenges with the config file. In monorepos it is important to be able to do directory level configurations, that way users can tweak rules for just their projects without affecting the entire repository.
I wanted to kick off a discussion around this problem first by getting an agreement on the requirements for fixit’s configs. Here is the problem space I wish to address:
Problem: I want to turn on (or off) a lint rule just for my directory
In Fixit today there is only one top-level config, and only one way to turn off rules for the repo: block_list_rules.
This means we can turn off rules for everyone or no one, with no subdirectory level support.
Problem: I want to customize a rule_config just for my directory
Same as the issue above, since there is only one root-level config file there is no way to customize rules for subdirectories.
Proposals
Inheritance of config files
The problems listed can be mostly addressed by supporting subdirectories with overriding configs. This way a top-level config file can exist that contains the defaults the majority of the repo will utilize, but subdirectories can override these defaults with their configurations.
Based on our experiences with Flake8, I propose these files support inheritance. When there is no config inheritance, users usually make copies of the top-level config to keep most of the defaults and only change the few they need. Unfortunately, these copies are not linked with the top-level config and get out of sync very quickly. Worse still, many users assume inheritance is a property already of the config files and do not copy the top-level configs at all, losing valuable and sometimes necessary customizations.
Enabling inheritance of config files will enable custom configurations per-directory, while keeping in sync with the top-level config and its defaults.
Inheritance opens up a lot of questions, which I have listed in the next section below.
Create an allow_list_rules
key
To fully address the first problem I propose we add a new key, allow_list_rules
.
Adding an allow_list_rules
section to the config will enable us to mark rules as ‘on’, instead of defaulting to everything running. Using this, directories can turn on rules they want without the rule also being turned on everywhere else.
This will also make it less risky to upgrade to a new version of Fixit. New rules can be added to Fixit, but they won’t start firing until we explicitly turn them on. This gives us a chance to test and vet new rules.
block_list_rules
should take precedence. This way inherited rules that are on can be overridden and turned off, as well as it enables the root config to turn off dangerous rules for all leaf configs in a single location.
Inheritance Questions
Assuming people are on board with the idea of inheritance, there are several questions that should be discussed.
Should leaf configs override the root, or merge with the root config?
Example:
root/.fixit.config.yaml
block_list_rules:
- BlockListedRule
root/subdir/.fixit.config.yaml
block_list_rules:
- OtherBlockListedRule
Overridden the final file looks like...
block_list_rules:
- OtherBlockListedRule
Merging/appending instead of overriding becomes...
block_list_rules:
- OtherBlockListedRule
- BlockListedRule
I think the answer here is merging should be the default, at least for keys like block_list_rules
. This way the root config has the power to en-mass turn off rules, which could be valuable if a rule is found to be dangerous.
Let’s check again with a more complicated example showing a deep merge in the rule_config
.
root/.fixit.config.yaml
block_list_patterns:
- '@generated'
- '@nolint'
block_list_rules:
- BlockListedRule
fixture_dir: ./tests/fixtures
formatter:
- black
- '-'
packages:
- fixit.rules
repo_root: .
rule_config:
ImportConstraintsRule:
fixit:
rules: [["*", "allow"]]
root/subdir/.fixit.config.yaml
block_list_rules:
- OtherBlockListedRule
packages:
- myproject.rules
rule_config:
ImportConstraintsRule:
mysubdir:
rules: [
["fixit", "allow"],
["other_module", "deny"],
["*", "deny"]
]
ignore_tests: True
ignore_types: True
message: "'{imported}' cannot be imported from within '{current_file}'."
Merged config
block_list_patterns:
- '@generated'
- '@nolint'
block_list_rules:
- BlockListedRule
- OtherBlockListedRule
fixture_dir: ./tests/fixtures
formatter:
- black
- '-'
packages:
- fixit.rules
- myproject.rules
repo_root: .
rule_config:
ImportConstraintsRule:
fixit:
- - '*'
- allow
mysubdir:
rules:
- - fixit
- allow
- - other_module
- deny
- - '*'
- deny
ignore_tests: true
ignore_types: true
message: '''{imported}'' cannot be imported from within ''{current_file}''.'
This again looks right to me. We’ve added some custom subdir configurations to the ImportConstraintsRule
, and we’ve extended the packages and block_list_rules
.
Should leaf configs always deep merge with the root config?
Assuming people agree with my assessment that the above config looks good, does it make sense for it to always deep merge? Let’s look at an example where keys in both the root and the leaf are identical:
root/.fixit.config.yaml
rule_config:
ImportConstraintsRule:
fixit:
rules: [["*", "allow"]]
root/subdir/.fixit.config.yaml
rule_config:
ImportConstraintsRule:
fixit:
rules: [
["fixit", "allow"],
["other_module", "deny"],
["*", "deny"]
]
ignore_tests: True
ignore_types: True
message: "'{imported}' cannot be imported from within '{current_file}'."
Merged file
rule_config:
ImportConstraintsRule:
fixit:
rules:
- - '*'
- allow
- - fixit
- allow
- - other_module
- deny
- - '*'
- deny
ignore_tests: true
ignore_types: true
message: '''{imported}'' cannot be imported from within ''{current_file}''.'
The rule_config
’s rules
section has been appended, but this results in an invalid rule: there are two *
rules which will raise an exception. Here the deep merge technique failed.
What if we override duplicate keys in the rule_config
section, and merge the rest?
This is possible to do. Overriding the rule_config
keys and merging everywhere else would result in:
block_list_patterns:
- '@generated'
- '@nolint'
block_list_rules:
- BlockListedRule
- OtherBlockListedRule
fixture_dir: ./tests/fixtures
formatter:
- black
- '-'
packages:
- fixit.rules
- myproject.rules
repo_root: .
rule_config:
ImportConstraintsRule:
fixit:
rules: [
["fixit", "allow"],
["other_module", "deny"],
["*", "deny"]
]
ignore_tests: True
ignore_types: True
message: "'{imported}' cannot be imported from within '{current_file}'."
Which is a correct config file. This is not prohibitively complicated to implement, we can choose which keys we want to be merged and which we want to be overridden with a tool like jsonmerge.
This strategy works well for the few rules I have tested it out on. My only concern is the variety of rules. I wonder if some rules would be better configured with a deep merge, instead of overriding. I’d appreciate any thoughts around this. It is possible to require each rule author to define whether they want their rule configured via a deep merge or overriding, but that is adding a layer of complexity.
Config resolution order?
How do we want to define the order of config inheritance? There’s several questions here:
- Should configs be limited to inheriting only from the root config?
- Should configs be able to inherit from configs outside of their directory tree?
- If so, how does the user define this (a new key in the config?)
- How do we avoid the diamond problem and determine order?
I think the most appealing way is to inherit via the directory hierarchy. We can read yaml files starting from the root (where default values would be) to the leaf (where most specific values would be, which should take precedence).
Like so:
root
├──root_config.yaml
└──myproject
├──sub_config.yaml
└──subdir
└── leaf_config.yaml
Here leaf_config.yaml
inherits from sub_config.yaml
and root_config.yaml
. sub_config.yaml
inherits just from root_config.yaml
. The user isn’t be required to do any extra configurations to indicate from where they wish to inherit, it happens automatically. I think this is intuitive to users, and keeps the implementation and maintenance simple.
Embedding partial configs?
This idea adds some complexity, but it is worth discussing. We could use interpolation or includes to embed whole configs in parts of another config. This might be interesting for individual rules in the rule_config
section.
Instead of defining default rule customizations in the root config, rule authors could generate a config just for their rule, and embed that into the root config. Something like this:
rule_config: !include root/rules/**/*.yaml
That would embed all the configs defined under rules.
This adds the overhead that rule authors would need to determine “default” rule configurations, and would result in the creation of more config files. You also would not be able to see the configuration of all the rules at a glance. But it would keep the root config a lot smaller and more manageable, and make it relatively easy to find and update rule_configs
specific to individual rules.
Thanks for reading this far! This is a brain dump of my thoughts, please comment on anything you disagree (or agree) with, or anything I have not thought of.