X Tutup
The Wayback Machine - https://web.archive.org/web/20260308030053/https://github.com/github/semantic/pull/486
Skip to content
This repository was archived by the owner on Apr 1, 2025. It is now read-only.

Kindness is a virtue: changing precise AST kinds to support diffing, comments, partial passes and more!#486

Merged
aymannadeem merged 239 commits intomasterfrom
codegen-cleanup
Jun 11, 2020
Merged

Kindness is a virtue: changing precise AST kinds to support diffing, comments, partial passes and more!#486
aymannadeem merged 239 commits intomasterfrom
codegen-cleanup

Conversation

@aymannadeem
Copy link
Contributor

@aymannadeem aymannadeem commented Feb 13, 2020

This PR introduces refactors that simplify readability, and will also make it easier to transition ASTs to kind (* -> *) -> (* -> *). This entails:

  • generate f :: * -> * parameters for the record types, in TH, and add them to the generated datatype
  • fix the Traversable, Foldable, and Functor instances to account for the new kindedness
  • ensure that the children of each record types are wrapped in an f (except for the ann field)
  • define some Error functor isomorphic to Either String
  • manually define applicative instance for Error type
  • add unit tests
  • use Error type in Unmarshal

Why is this important?

This addresses #480, #481, #482.

TL;DR: it's a byproduct of having strongly-typed trees, where we can't take advantage of historic polymorphic-ness of our nodes and use Error types everywhere.

@patrickt explained this to me using the following example. Consider a data type like this which looks like the ALC syntax:

data Plus a = Plus { ann :: a, left :: Expression a, right :: Expression a }

That type is polymorphic in its annotation type a, but its children are monomorphic. They are always Expression values. And if we get a parse error from tree-sitter, we can’t construct an Expression. Under a-la-carte syntax, all the children of a node were polymorphic, so we were able just to stick an Error node in there. But our types are stronger now, so we can’t get away with just stuffing Errors everywhere. Instead, what we’re gonna generate is this:

data Plus f a = Plus { ann :: a, left :: f (Expression f a), right :: f (Expression f a) }

In this new precise world, Plus has kind (* -> *) -> * -> * where the first parameter f is a “shape” functor that dictates what possible shapes the children could have. If we start substituting values for f, we get different types of trees. For example,Plus Identity means that left :: Identity (Expression Identity a), right :: Identity (Expression Identity a).

Since Identity a is isomorphic to a, that means Plus Identity is isomorphic to our old definition of Plus.

But what happens when we start getting some more interesting functors? Let’s consider Plus (Either String). Substituting in Either String for f, that leaves us with children that look like left :: Either String (Expression (Either String) a), right :: Either String (Expression (Either String) a) and that lets us handle parse errors. The left field is Right if the parse succeeded and Left if it failed. This means we can slurp a partial or erroneous tree via Unmarshal.

This PR also has a few additional changes that are:

  • Cosmetic! 💄 Better readability, reducing repetition and using more expressive functions.
  • Standardizing: use the TH function convention instead of data constructors (ie., conT instead of ConT); introducing constructor names in signatures that otherwise had ambiguous types (ex., DatatypeName instead of String).

@aymannadeem aymannadeem changed the title Codegen cleanup Kindness is a virtue: changing precise AST kinds to support diffing, comments, partial passes and more! Feb 20, 2020
@aymannadeem aymannadeem requested review from patrickt and robrix and removed request for robrix June 8, 2020 16:25
@aymannadeem
Copy link
Contributor Author

Update: none of the kind adjustments happened due to the new shape being in conflict with Traversable1. However, this PR accomplishes adding an error functor Err.

@aymannadeem aymannadeem requested a review from a team June 8, 2020 23:50
Copy link
Contributor

@patrickt patrickt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of things to iron out here, but on the whole this is looking really good! Huge kudos to you, @aymannadeem; I know that this was a Herculean level of effort. Once the various pragmas/pattern matches are ironed out, we’ll figure out when and where to merge this!

Comment on lines +65 to +66
# cabal v2-run --project-file=cabal.project.ci semantic-python:test:compiling
# cabal v2-run --project-file=cabal.project.ci semantic-python:test:graphing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consult with @joshvera and @BekaValentine about how/when we want to merge the efforts in this branch with the scope-graphing ones. We could make this branch based off of the stack graph one, or vice versa; either way, I’m happy to help with the plug-and-chug addition of Success branches.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to get this in and deal with the merge on the stack graph branch later.


instance Applicative Err where
pure = Success
Fail e <*> _ = Fail e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth noting that this instance prefers the error message on the LHS of the <*> operator. Alternatively, we could do

data Err a = Fail (NonEmpty String) | Success a

and say that the Applicative instance over Fail concatenates the reasons for failure.

Language.Python
Language.Python.AST
Language.Python.Core
-- Language.Python.Core
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we file a bug to reenable the Core stuff (or elide it entirely for the time being)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docComment
src
( Parse.Success
( R1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this combination of Success and R1 can be expressed with EPrj (and further on with that L1 pattern match).

@aymannadeem aymannadeem mentioned this pull request Jun 11, 2020
@aymannadeem aymannadeem requested a review from patrickt June 11, 2020 17:01
Copy link
Contributor

@patrickt patrickt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s go!!!

@aymannadeem aymannadeem merged commit 621696f into master Jun 11, 2020
@aymannadeem aymannadeem deleted the codegen-cleanup branch June 11, 2020 17:16
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

X Tutup