crosswind's Introduction

crosswind

About

Crosswind is a tool and library that aims to make it easy to convert a codebase from Python 2 to 3 and back. The intention is to let teams write modern Python 3 even if they still have to support older versions of their software running on Python 2. Crosswind also aims to provide a means of writing arbitrary large-scale refactorings that can be applied to a codebase in much the same way as the 2to3 or 3to2 migrations.

Crosswind is a fork of the lib3to2 project that can be found on Bitbucket and PyPI. The source history has been ported from Mercurial to Git using hg-fast-export.

lib2to3 was also ported over from the Python project.

Objectives/Goals

The following are the overarching objectives of crosswind. They are subject to change as a game plan is established. For now, they are:

  • Support only Python 2.7 for backporting 3 to 2
  • Support upgrading to 3.6 and above when migrating 2 to 3
  • Running all tooling is only supported on 3.6 and up
  • Reach and maintain parity between the two_to_three and three_to_two fixer suites
  • Consistently pull in upstream updates for both lib2to3 and lib3to2 when it aligns with the above goals
  • Investigate/resolve open issues on lib3to2

Contributing

If you're looking for ways to contribute there are a few options to choose from:

  • Helping write wiki pages with more knowledge of all facets of writing fixers or tests or understanding the pgen/lib2to3 libraries/APIs
  • Try out Crosswind on your codebase and open issues for things you suspect aren't coming out quite right (please include code samples!)
  • Submit a pull request with a fix or enhancement (it would be ideal to open an issue first for discussion)

You can start by taking a look at the Writing a Fixer wiki page to learn about what goes into the fixer and its tests.

Development

Crosswind uses the Poetry tool for managing dependencies and virtual environments. Common development tasks have aliases that have been collected in a Makefile at the root of the project.

To get started with crosswind development, run make install. This will use Poetry to create the virtualenv and install the dependencies (both runtime and dev-time). Note that your currently active python must be compatible with the versions listed in the pyproject.toml file. If it isn't, you'll have to manage the virtualenv creation yourself, for example by executing the following in your checkout of Crosswind:

python3 -m virtualenv .venv

If you don't yet have virtualenv available you can get it via pip: python3 -m pip install virtualenv. Once you have the virtualenv created you will need to activate it before continuing.

With the virtualenv created and activated, you can then run make install to have Poetry download and install dependencies into your activated venv.

Then you can lint the code and run the tests with make lint and make tests or combine them (since they're separate targets and make allows specifying many targets in a single invocation) with make lint tests and so on.

Currently, running crosswind requires a manual invocation of poetry such as:

poetry run python crosswind/crosswind --help

This will show the help output containing the currently supported options. It should be noted that the current state of the crosswind tool is that it only has access to the 2to3 fixers. Further efforts are needed to allow it to combine arbitrary fixers and fixer suites.

Configuration

In addition to passing flags at the command line (run crosswind with --help for the available flags) you can configure crosswind in pyproject.toml under a tool.crosswind heading such as:

[tool.crosswind]
output_dir = "path/to/output"

Additionally, presets can be defined by adding .preset.<name> as an additional heading like:

[tool.crosswind.preset.foo]
output_dir = "foo/specific/path"

If a preset is specified it will exclusively use that preset's configuration and will not merge it with the default configuration. Command line flags are still merged into the preset configuration however.
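The preset rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not crosswind's actual implementation: resolve_config is a hypothetical helper, the dictionary stands in for an already-parsed pyproject.toml, and hypothetical_option is a made-up key used only to show that a preset does not inherit default settings.

```python
def resolve_config(pyproject, preset=None, cli_flags=None):
    """Return the effective settings (hypothetical helper, not crosswind's API)."""
    tool = pyproject.get("tool", {}).get("crosswind", {})
    if preset is None:
        # Default configuration: every key except the nested preset table.
        config = {key: value for key, value in tool.items() if key != "preset"}
    else:
        # A preset is used exclusively; defaults are NOT merged in.
        config = dict(tool.get("preset", {})[preset])
    # Command line flags are merged on top in both cases.
    config.update(cli_flags or {})
    return config


# Stand-in for the parsed pyproject.toml shown above.
pyproject = {
    "tool": {
        "crosswind": {
            "output_dir": "path/to/output",
            "hypothetical_option": True,  # made-up key, for illustration only
            "preset": {"foo": {"output_dir": "foo/specific/path"}},
        }
    }
}

default_config = resolve_config(pyproject)
foo_config = resolve_config(pyproject, preset="foo")
foo_with_flag = resolve_config(pyproject, preset="foo", cli_flags={"output_dir": "cli/path"})
```

Under these assumptions, foo_config contains only the preset's own output_dir, while foo_with_flag takes its output_dir from the command line.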

Thanks/Inspiration

Thanks to Joseph Amenta for developing lib3to2 and making it open source. This effort wouldn't be possible without it!

Also a huge thank you to everyone who has contributed to Python, pgen, and lib2to3. Having such a foundation to start from has made things immeasurably easier and far more productive!

And finally a shoutout to contributors of both futurize and drop2 which together served as the basis of the defuturize fixer suite.

crosswind's People

Contributors

airbreather, blink1073, jakethagle, locutusofborg, myint, ryanwersal, xoviat

crosswind's Issues

Add means of validating resulting code

It seems ideal to add a --validate flag or similar to allow users of crosswind to validate the resulting code. If we are looking to do this for both 2 and 3 we'll likely need to use 2to3's AST but if we are okay with only supporting validation on the 3 side then we could look at LibCST.
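As a rough idea of what validation limited to the Python 3 side could look like, the sketch below uses only the standard library ast module rather than LibCST or 2to3's parser. The function name validate_python3_source is hypothetical, and the check only confirms that the text parses under the interpreter running it.

```python
import ast


def validate_python3_source(source, filename="<converted>"):
    """Return None if the source parses, else a short error description.

    Hypothetical helper: it checks syntax only, against the grammar of the
    interpreter running this function.
    """
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return "{}:{}: {}".format(filename, exc.lineno, exc.msg)
    return None


ok_result = validate_python3_source("print('converted')\n")
bad_result = validate_python3_source("print 'not converted'\n")
```

Because ast.parse follows the running interpreter's grammar, this approach cannot validate Python 2 output, which matches the trade-off described above.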

Remove feature_base from 3to2

We're dropping support for Python versions below 2.7 so it is no longer needed. It appears to have been used to provide warnings for features that couldn't be converted.

Add itertools_imports fixer to 3to2

There is a fixer for these in 2to3, so we should match it with one in 3to2 since, as best I can tell from the docs, the itertools functions involved are not available under the same names in Python 2 and 3.

Default to only running fixer_suites.two_to_three

I ran crosswind with no arguments and the output looked odd - it had used all of the fixer suites (including three_to_two), which is a surprising and incorrect result. We need to choose a reasonable default, and two_to_three seems the most reasonable. Alternatively, we could require the fixer suites to be provided explicitly. This should be less cumbersome to do once we're on argparse or another CLI library.
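One way the default could be expressed once the CLI is on argparse is sketched below. The --fixer-suite flag name and the parse_args helper are assumptions made for illustration; they are not existing crosswind options.

```python
import argparse

# Suite used when the caller does not name any (the default proposed above).
DEFAULT_SUITES = ["fixer_suites.two_to_three"]


def parse_args(argv):
    """Parse a hypothetical --fixer-suite flag, defaulting to two_to_three."""
    parser = argparse.ArgumentParser(prog="crosswind")
    parser.add_argument(
        "--fixer-suite",
        dest="fixer_suites",
        action="append",
        default=None,
        metavar="SUITE",
        help="fixer suite to run; may be given more than once",
    )
    args = parser.parse_args(argv)
    if not args.fixer_suites:
        # Apply the default after parsing so explicit suites replace it.
        args.fixer_suites = list(DEFAULT_SUITES)
    return args


no_flags = parse_args([])
explicit = parse_args(["--fixer-suite", "fixer_suites.three_to_two"])
```

The default is applied after parsing because argparse's append action adds to a list default instead of replacing it, which would make two_to_three impossible to switch off.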

Handle Name on both sides of inequality in fix_nonetype_inequality

Currently the fixer-as-written will use the first "name" leaf it encounters as the value that could be None. I'm not sure what the correct pattern should be for a conditional like:

if foo < bar:

Right now it turns into:

if foo is None or foo < bar:

But that isn't pessimistic enough about bar - but what does the correct code look like here?
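One candidate answer, assuming the goal is to reproduce CPython 2's ordering in which None compares lower than every other object, is to guard both operands. The helper name py2_style_less_than is hypothetical and is not part of the fixer.

```python
def py2_style_less_than(left, right):
    """Emulate CPython 2's `left < right` when either side may be None."""
    if left is None:
        # None sorts below everything except itself.
        return right is not None
    if right is None:
        # Nothing is less than None.
        return False
    return left < right


def inline_form(foo, bar):
    # The same logic as a single expression a fixer could emit.
    return bar is not None and (foo is None or foo < bar)
```

Under that assumption the conditional would become if bar is not None and (foo is None or foo < bar):, which also avoids the TypeError that Python 3 raises when bar is None.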

Add fullargspec fixer to 2to3

It isn't yet clear to me why this exists in the 3to2 fixer suite but it does and it strikes me as something we should mirror in the 2to3 fixer suite.
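For context, and assuming the 3to2 fixer maps inspect.getfullargspec back to Python 2's inspect.getargspec, the Python 3 side of such a mirror fixer would target calls like the one below. The function example is a placeholder.

```python
import inspect


def example(a, b=1, *args, **kwargs):
    """Placeholder function to introspect."""


# Python 3 spelling; Python 2.7 only offers inspect.getargspec.
spec = inspect.getfullargspec(example)
positional = spec.args
defaults = spec.defaults
var_keyword = spec.varkw  # getargspec calls this field `keywords`
```

inspect.getargspec was removed in Python 3.11, so code that still calls it after migration stops working on newer interpreters.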

Initial efforts to make lib3to2 use the latest lib2to3 library

Currently the 3to2 portion is still a fork from a much older lib2to3. There are several comments throughout the source indicating that code was copied and minimally changed or, seemingly more typically, was copied out of convenience.

It seems likely (and desirable) to start work on updating the lib3to2 side to use the crosswind.lib2to3 code as much as possible.

Figure out PATTERN

For this project to be successful we need to make it as easy as possible to create and contribute correct fixers for either direction. To do this, we need to find (or make) tooling and documentation. See the Raw PATTERN Notes wiki for a place to accumulate notes. Eventually we'll turn that into more proper documentation.

Add support for multiple fixer bundles

Currently lib2to3 and lib3to2 behave as separate libraries - while I want to merge them together at some point (effectively bringing the forked 3to2 more in line with the much more modern 2to3 code from Python 3.8 RC1) they'll still likely remain as separate fixer bundles.

As it stands, the fixer bundle (or fixer_pkg as referred to in current code) is "hardcoded" in the cli entrypoint. We'll need to add the ability to define many packages (since we'll presumably always have 2to3 and 3to2 but may also have packages to remove the compatibility stuff from futurize runs etc).

That probably means we need something like:

  • Way to list the bundles in a similar manner to how fixers are listed
  • Way to define which bundles should be executed (same as fixers) since they likely won't always be mutually exclusive

Relevant code would be in crosswind/crosswind and the __main__.py and main.py files in both lib2to3 and lib3to2 folders.

Integrate find_pattern

Relates to #3.

It was discovered that there is a script in some dumps of the Python source (see the wiki) but it doesn't appear to be in the lib2to3 of HEAD CPython.

Figure out why and integrate it in (or something similar to it).

Add urllib fixer to 3to2

2to3 has a fixer shuffling several imports around due to the module split in Python 3. Best I can tell we will want a 3to2 fixer to reverse those changes.
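As a small illustration of the module split, the snippet below imports two functions from their Python 3 location and falls back to their Python 2.7 locations. It is a compatibility shim rather than a fixer, and only shows the kind of mapping a 3to2 fixer would need to reverse.

```python
try:
    # Python 3: both functions live in urllib.parse.
    from urllib.parse import urlencode, urlparse
except ImportError:
    # Python 2.7: the same functions are spread over two modules.
    from urllib import urlencode
    from urlparse import urlparse

parts = urlparse("https://example.com/a?b=1")
query = urlencode({"b": 1})
```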

Add xreadlines fixer for 3to2

Need to create a fixer to reverse 2to3's xreadlines changes (and/or confirm that the code is functional for both Python 3.6+ and Python 2.7).
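On the second option: 2to3 rewrites for line in f.xreadlines() to plain iteration over the file object, and that form works on Python 2.7 and Python 3 alike. A quick check, using io.StringIO as a stand-in for a real file:

```python
import io

# io.StringIO exists on both Python 2.7 and 3 and iterates line by line.
stream = io.StringIO(u"first\nsecond\n")

# Direct iteration is the form 2to3 produces in place of xreadlines().
lines = [line.rstrip(u"\n") for line in stream]
```

If that holds for the cases we care about, a reverse fixer may be unnecessary.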

Add collections fixer for two_to_three

It currently doesn't appear that 2to3 has a fixer similar to fix_collections in 3to2. It looks like it would be desirable to upgrade such code during the migration to 3.

Add tests to confirm py2 and py3 behaviors

This is a little strange since it'll mean having a separate "project" (since it needs to run under py2 etc) but it seems worth it to have it as part of our check pipeline.

Add wiki page comparing the fixers in 2to3 and 3to2

Ideally we would be able to "roundtrip" code through 2to3 and back with 3to2 and result in near identical output. It is a lofty and likely unobtainable goal but we can start with creating some wiki pages and/or other forms of documentation to document the fixers we do have.

Presuming we find differences we'll be able to use them to finish bulking out the two fix suites before we start looking at other bugs/features to do.

Add super fixer to 2to3

super calls are simplified in Python 3 and the 3to2 fixer suite has a fixer to "downgrade" it back to Python 2's required arguments. It would be ideal to have a fixer to simplify these calls during upgrade if possible.
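The two spellings in question look like this; the class names are placeholders. Both forms run on Python 3, but only the first is valid on Python 2.

```python
class Base(object):
    def __init__(self):
        self.tag = "base"


class LegacyChild(Base):
    def __init__(self):
        # Python 2 form: class and instance must be passed explicitly.
        super(LegacyChild, self).__init__()


class ModernChild(Base):
    def __init__(self):
        # Python 3 form: the arguments are filled in automatically.
        super().__init__()
```

A 2to3 fixer would rewrite the first form into the second, presumably only when the arguments are exactly the enclosing class and the method's first parameter.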

Remove with fixer from 3to2

The __future__ with statement became canonical in 2.6 so we can drop this fixer as we're only supporting 2.7.

Add Makefile for source tasks

As poetry lacks support for script aliases we'll have to go another route. Sounds like make is agreeable enough so we should look at getting a Makefile added containing at least the following tasks/targets:

  • linting
  • running tests
  • formatting the code

Add reload fixer to 3to2

importlib.reload does not exist in Python 2.7 so we'll need to match the fix_reload fixer in 2to3 with one in 3to2.
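To make the difference concrete, the shim below uses importlib.reload where it exists and otherwise relies on the reload builtin that Python 2.7 provides. It is a compatibility sketch, not the fixer itself.

```python
import json

try:
    # Python 3.4+: reload lives in importlib.
    from importlib import reload
except ImportError:
    # Python 2.7: importlib has no reload, but reload() is a builtin.
    pass

reloaded = reload(json)
```

A 3to2 fixer would go one step further and rewrite importlib.reload(module) back to the bare reload(module) call.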

Convert all tests from unittest to pytest

This is done for a few reasons:

  • The tests should be easier to maintain since pytest has less ceremony
  • Several tests rely on introspecting stdout/stderr, which pytest supports natively (the current tests have to capture those streams manually)
  • Should be easier to add more tests going forward (less setup/teardown and no need to add full classes etc)

This should be considered blocked by #14

Add "defuturize" fixer suite

Need to figure out exactly what this entails, but the general premise is that futurize's approach of simultaneously supporting 2 and 3 has been great for a hybrid Python 2 and 3 world but seems less ideal with Python 2 reaching end of life. As a result, we should start working out what can be undone in a futurized project. A few very offhand thoughts:

  • Can past and future imports simply be removed?
  • What other changes are required to make the code "correct"? Simply run two_to_three and defuturize fixer suites together?
  • Should defuturize be created like lib3to2 was? My understanding is it forked lib2to3 and then swapped each fixer's PATTERN and transform method implementations around to reverse the process.

Opening this issue to hopefully generate some discussion of what work is entailed in this.

Add code formatter

Sounds like black is the current front runner.

Doing this requires us to effectively "hard fork" (insofar as we can no longer easily pull upstream changes down) but I am perfectly okay with that given that we should eventually make the lib3to2 portion use the latest lib2to3 infrastructure.

Create new "pessimist" fixer suite

In attempting to use crosswind on a couple code bases it is evident that we'll need another fixer suite that I'm naming "pessimist" since it will likely involve overly protective (aka pessimistic) fixers.

test_main.py has several assertions commented out

This is due to the capturing of stdout/stderr with pytest (which is good!). We should look at using this ticket to do some amount of #15 and get test_main.py updated so we can use proper capsys calls instead of the existing janky capture of the out/err streams.

Add standarderror fixer in 3to2

StandardError and Exception aren't the same thing so we will need to add a fixer for it in 3to2 to mirror the 2to3 fixer.

Figure out packaging for crosswind

#11 will create our pipeline. This ticket will help us decide between options on how to package, test the package(s), and how to deploy it to PyPI.

A few options for packaging:

  • Source only wheel
  • Compiled wheels (Nuitka? Benefits seem a bit dubious but presented for completeness.)
  • Source only package

I'm not sure how best to test the packaging (mostly because I'm unsure how to get the built package prior to submission and ensure it matches our expectations). It would be ideal if we are able to have some acceptance tests that can run once the package is created and installed on a build node (does the cli run, do trivial examples work, does the tooling function in certain scenarios).

And finally we'll need to determine how to get this all submitted to PyPI.

Add dict fixer for 3to2

There are a lot of changes made when going through 2to3 with regards to dictionary usage. We should add a fixer to reverse those changes when going down to 2. Note that this doesn't need perfect parity - we can look to either:

  • Update the names to align the behaviors between 2 and 3
  • Or just add some defensive wrapping (similar to leaving in the extra list calls in various fixers)
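The second option can be illustrated with the forms 2to3 itself emits: d.keys() becomes list(d.keys()) and d.iteritems() becomes iter(d.items()). Both run unchanged on Python 2.7, at the cost of an extra copy there, so a 3to2 fixer could leave them alone.

```python
d = {"a": 1, "b": 2}

# list(...) around keys() is redundant on Python 2 but harmless.
keys = sorted(list(d.keys()))

# iter(d.items()) builds a full list first on Python 2, then iterates it.
pairs = sorted(iter(d.items()))
```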

Add toml support

The code formatter we use, black, supports configuration in a toml file. This allows it to have its configuration in a single file instead of endlessly proliferating dotfiles etc. We should follow their lead and use config key names that are the long command line forms.

Determine what versions of Python to support

Currently the README indicates that only 2.7 is supported on the 2.x side of things and that we support 3.5+ on the 3.x side. Updates to other Python libraries seem to be dropping support for 3.5 so I'm inclined to think we should as well. Our personal requirements still mandate support for 3.6 but it seems doing 3.6+ is reasonable.

Thoughts?

Create build matrix for supported python versions

Would be ideal to run all our build checks against Python versions 3.6, 3.7, and the newly released 3.8. We should probably also look at running at least one version on Windows and macOS if not all of them.

Remove 3to2 bitlength fixer

We're only supporting 2.7 on the Python 2 side and it looks to support the same notion of bitlength as Python 3 which renders this fixer superfluous.

Update fix_set_literal for use when list is used

Two-to-three's fix_set_literal handles cases such as:

set([1, 2, 3])

and updates it to:

{1, 2, 3}

It does not handle cases where the list already exists:

x = [1, 2, 3]
set(x)

This does not get converted.

It seems it may be desirable to update it to expand via set literal?

x = [1, 2, 3]
{*x}
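A quick check of the proposed rewrite: unpacking into a set display gives the same result as calling set, including for an empty list.

```python
x = [1, 2, 3]

from_call = set(x)
from_display = {*x}  # unpacking in a set display, Python 3.5+ only

# Unpacking an empty list yields an empty set, not an empty dict.
empty = {*[]}
```

Since the unpacking syntax does not exist on Python 2.7, any such rewrite would need a matching 3to2 fixer to keep round-tripping possible.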

Create initial CI/CD pipeline using GitHub Actions

I've enabled GH Actions - the relevant files are in .github/workflows. We need to start looking at the CI/CD pipeline. For now, I think that would consist of:

  1. Run all tests using pytest
  2. Confirm code is formatted correctly according to the black tool

I'm inclined to not worry about packaging as of yet since we'll want to decide what that looks like (source only wheel, compiled wheels of some kind, source only package, etc).

Also #9 is likely related since that'll allow us to give aliases to these tasks and use them within the build scripts.
