
symbol-analyzer's Introduction

symbol-analyzer

symbol-analyzer is a code analyzer that determines how each symbol in a piece of Clojure code is being used. It can be used in various ways, such as static analysis of Clojure code or defining complicated macros that require a code walker.

Installation

The latest stable release is 0.1.1.

Add the following dependency to your project.clj file:

Clojars Project

Basic Usage

symbol-analyzer offers two basic operations: extract and analyze.

Note that symbol-analyzer is still of alpha quality, and its APIs (especially the format of the return values described below) are highly subject to change.

Extract

Extract analyzes how the specified symbols in the code are being used and returns the result, which we call symbol information. The target symbols are specified by attaching a unique ID to each of them as metadata under a specific key. The default key is :id.

For example, we can analyze the usage of the second x in (let [x 0] x) as follows:

user=> (require '[symbol-analyzer.extraction :refer [extract]])
nil
user=> (extract '(let [x 0] ^{:id 0} x))
{0 {:type :local, :usage :ref, :binding :none}}
user=>

In this example, we assign ID 0 to the second x, the one we are targeting. The result tells us that the symbol with ID 0 is a reference to a local binding. Similarly, we get the following result if we assign IDs to the other symbols as well:

user=> (extract '(^{:id 0} let [^{:id 1} x 0] ^{:id 2} x))
{2 {:type :local, :usage :ref, :binding 1}, 1 {:type :local, :usage :def}, 0 {:type :macro, :macro #'clojure.core/let}}
user=>

symbol-analyzer can even analyze code containing user-defined macros. When the analyzer encounters a macro in the course of analysis, it expands the macro itself and identifies the usage of each symbol from the expanded form, which no longer contains macros.

user=> (defmacro let1 [name expr & body] `(let [~name ~expr] ~@body))
#'user/let1
user=> (let1 x 2 (* x x))
4
user=> (extract '(let1 ^{:id 0} x 2 (* ^{:id 1} x x)))
{1 {:type :local, :usage :ref, :binding 0}, 0 {:type :local, :usage :def}}
user=>
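
The IDs used above are ordinary reader metadata, so the annotations can be checked with plain Clojure before a form is handed to extract. The following is a minimal sketch that uses only clojure.core; marked-symbols is a hypothetical helper for illustration and is not part of symbol-analyzer.

```clojure
;; Hypothetical helper (not part of symbol-analyzer): collect every symbol
;; in a quoted form that carries metadata under key k, indexed by its ID.
(defn marked-symbols
  ([form] (marked-symbols form :id))
  ([form k]
   (->> (tree-seq coll? seq form)          ; walk all nested collections
        (filter symbol?)
        (filter #(contains? (meta %) k))   ; keep only annotated symbols
        (map (fn [sym] [(get (meta sym) k) sym]))
        (into {}))))

(marked-symbols '(let [^{:id 1} x 0] ^{:id 2} x))
;; => {1 x, 2 x}
```

Symbols without the key are simply skipped, so an unannotated form yields an empty map.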

Analyze

Analyze applies extract to all the symbols in the code. The symbol information resulting from the extraction is attached to the symbols in the input code as metadata.

user=> (require '[symbol-analyzer.core :refer [analyze-sexp]])
nil
user=> (set! *print-meta* true)    ; to visualize metadata
nil
user=> (analyze-sexp '(let [x 0] x))
(^{:symbol-info {:type :macro, :macro #'clojure.core/let}, :id 7} let
 [^{:symbol-info {:type :local, :usage :def}, :id 8} x 0]
 ^{:symbol-info {:type :local, :usage :ref, :binding 8}, :id 9} x)
user=>

Using the analysis results, we can also write certain kinds of code walkers rather easily. For example, suppose we want to do something to all (and only) the symbols representing local bindings. Doing so usually takes a great deal of effort, because we have to write the code that tracks local environments and traverses the input ourselves. With symbol-analyzer, on the other hand, we can build such code walkers by combining the analysis results with simple sequence functions such as reduce and filter, or with the functions in clojure.walk. In the example below, we use analyze-sexp to define a function that renames the symbols representing local bindings:

user=> (require '[clojure.walk :refer [postwalk]])
nil
user=> (defn rename-locals [sexp]
  #_=>   (postwalk (fn [x]
  #_=>               (if (and (symbol? x)
  #_=>                        (= (-> x meta :symbol-info :type) :local))
  #_=>                 (symbol (str \? x))
  #_=>                 x))
  #_=>             (analyze-sexp sexp)))
#'user/rename-locals
user=> (rename-locals '(let [x x] [x 'x]))
(let [?x x] [?x (quote x)])
user=>

Note that only the symbols representing the local x bound by let are renamed to ?x. Neither the x that occurs as a free variable nor the quoted literal symbol is affected.
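
The sequence-function style mentioned above can be sketched in the same spirit. To keep the snippet self-contained, the input below is annotated by hand in the shape of the analyze-sexp output shown earlier instead of calling analyze-sexp; local-symbols and analyzed are hypothetical names used only for illustration.

```clojure
;; Hypothetical helper: list the symbols whose :symbol-info marks them as :local.
(defn local-symbols [analyzed-form]
  (->> (tree-seq coll? seq analyzed-form)
       (filter symbol?)
       (filter #(= :local (-> % meta :symbol-info :type)))))

;; Stand-in for (analyze-sexp '(let [x 0] x)). The metadata is written by hand
;; and only mirrors the format shown in the analyze-sexp example.
(def analyzed
  (list (with-meta 'let {:symbol-info {:type :macro}, :id 7})
        [(with-meta 'x {:symbol-info {:type :local, :usage :def}, :id 8}) 0]
        (with-meta 'x {:symbol-info {:type :local, :usage :ref, :binding 8}, :id 9})))

(map (comp :usage :symbol-info meta) (local-symbols analyzed))
;; => (:def :ref)
```

With the library on the classpath, the hand-built value would presumably be replaced by the result of analyze-sexp.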

In addition to analyze-sexp, which takes an S-expression as input, symbol-analyzer provides another API named analyze, which takes Clojure code represented as Sjacket nodes. This interface is intended for implementing tools that analyze code and write it back as text, such as syntax highlighters (see genuine-highlighter for an example of a symbol-analyzer-backed syntax highlighter).

License

Copyright © 2014-2015 OHTA Shogo

Distributed under the Eclipse Public License version 1.0.

symbol-analyzer's People

Contributors

athos

symbol-analyzer's Issues

making compatible with Sjacket parse tree

The analyzer currently takes a customized format of Sjacket parse trees as its input. Under that customization, the analyzer assumes that the root node has the :root tag (not :net.cgrand.sjacket.parser/root, as Sjacket uses) and that symbol nodes are always annotated with IDs.

To make the analyzer accept fully Sjacket-compatible parse trees as its input, the following tasks should be done:

  • make the converter able to handle root nodes with the Sjacket-compatible tag
  • annotate every symbol node with an ID before beginning analysis

more details in README

In particular, the following content must be included:

  • installation
  • basic usage
    • extract
    • analyze

extractor omits symbol information taken from metadata

The current extractor omits symbol information taken from metadata.

An example is shown below:

user=> (def s "foo")
#'user/s
user=> (pprint (analyze (p/parser "^String s")))
{:tag :net.cgrand.sjacket.parser/root,
 :content
 [{:tag :meta,
   :content
   ["^"
    {:tag :symbol, :content [{:tag :name, :content ["String"]}], :id 9} ; <- this symbol node should have symbol information
    {:tag :whitespace, :content [" "]}
    {:tag :symbol,
     :content [{:tag :name, :content ["s"]}],
     :symbol-info
     {:type :var, :usage :ref, :var #<Var@11ae5c1: "foo">},
     :id 10}]}]}
nil
user=>

To fix this defect, we need to modify the extraction functions for each special form and for sequence forms so that they handle metadata properly.

more examples needed

The README and docstrings alone are not enough to understand how to use the library.

prepare 0.1.0 release

Before releasing 0.1.0, the following items must be completed:

  • update the version of the library in project.clj
  • add a description about installation to README
  • generate pom.xml

extractor doesn't reflect semantics of .(dot) correctly

imported from athos/genuine-highlighter#11

In the current Clojure implementation, local bindings are ignored if the first argument of the . (dot) special form looks like a class reference.

For example, the following is regarded as a valid expression:

(let [Integer "foo"]
  (. Integer valueOf 1)) ;=> 1

According to this result, the second occurrence of Integer must be interpreted as a class reference, while the extractor currently interprets it as a reference to the local binding.

(extract '(let [Integer "foo"] (. ^{::c/id 0}Integer valueOf 1)))
;=> {0 {:type :local, :usage :ref, :binding {:type :local, :usage :def}}}

The extractor should be modified so that it reflects the dot semantics described above.

more informative analysis result for symbols in syntax-quote forms

The current implementation of the analyzer always identifies symbols in syntax-quote forms simply as quoted.

For example,

user=> (pprint (analyze (p/parser "`(let [x 0] x)")))
{:tag :net.cgrand.sjacket.parser/root,
 :content
 [{:tag :syntax-quote,
   :content
   ["`"
    {:tag :list,
     :content
     ["("
      {:tag :symbol,
       :content [{:tag :name, :content ["let"]}],
       :symbol-info-key {:type :quote},
       :id 7}
      {:tag :whitespace, :content [" "]}
      {:tag :vector,
       :content
       ["["
        {:tag :symbol,
         :content [{:tag :name, :content ["x"]}],
         :symbol-info-key {:type :quote},
         :id 8}
        {:tag :whitespace, :content [" "]}
        {:tag :number, :content ["0"]}
        "]"]}
      {:tag :whitespace, :content [" "]}
      {:tag :symbol,
       :content [{:tag :name, :content ["x"]}],
       :symbol-info-key {:type :quote},
       :id 9}
      ")"]}]}]}
nil
user=>

Strictly speaking, the result is correct, but in some cases it would be more helpful if the analyzer could identify let as :macro and x as :local. To this end, we should probably not only modify the extractor but also make the converter collaborate with it.

converter omits symbol IDs in syntax-quote form

During conversion, the converter omits the symbol IDs occurring in syntax-quote forms. As a result, the analyzer can't recognize symbols in syntax-quote forms.

Here is an example:

user=> (pprint (analyze (p/parser "`cons")))
{:tag :net.cgrand.sjacket.parser/root,
 :content
 [{:tag :syntax-quote,
   :content
   ["`"
    {:tag :symbol,
     :content [{:tag :name, :content ["cons"]}],
     :id 7}]}]}
nil
user=>

The symbol cons should instead be resolved to symbol type :quote or something similar.

The cause is that the converter generates a lot of new symbols (for namespace-qualified symbols, gensyms, etc.), and the metadata of the original symbols is not attached to the new ones.

resolve symbol role conflict

imported from athos/genuine-highlighter#6

Each symbol has one of the following roles (for now): special, macro, var, local or symbol. We temporarily assume that every symbol has only a single role.
To handle more complicated situations (e.g. where a symbol is put into multiple places through macro expansion), we should allow a symbol to take more than one role. To this end, the extractor must be equipped with some kind of conflict resolution mechanism.

user-customizable marking key

The extractor accepts a fixed metadata key (:symbol-highlighter.conversion/id) as the mark by which users indicate which symbols' information should be extracted. Although a namespace-qualified keyword is necessary to avoid accidental clashes with keys used by other modules, it can be tedious to attach such a long keyword to every symbol to be extracted.

So, for the sake of convenience, the extractor should be modified to take an additional optional argument that lets users specify the marking key.

extractor ignores operator symbols in Java interop forms

The extractor currently silently ignores symbols that appear in operator position in some kinds of Java interop forms.

user=> (pprint (analyze (p/parser "(String. \"foo\")")))
{:tag :net.cgrand.sjacket.parser/root,
 :content
 [{:tag :list,
   :content
   ["("
    {:tag :symbol,
     :content [{:tag :name, :content ["String."]}],
     :id 13}
    {:tag :whitespace, :content [" "]}
    {:tag :string, :content ["\"" "foo" "\""]}
    ")"]}]}
nil
user=> (pprint (analyze (p/parser "(.getName Class)")))
{:tag :net.cgrand.sjacket.parser/root,
 :content
 [{:tag :list,
   :content
   ["("
    {:tag :symbol,
     :content [{:tag :name, :content [".getName"]}],
     :id 11}
    {:tag :whitespace, :content [" "]}
    {:tag :symbol,
     :content [{:tag :name, :content ["Class"]}],
     :symbol-info {:type :class, :class java.lang.Class},
     :id 12}
    ")"]}]}
nil
user=> (pprint (analyze (p/parser "(Integer/parseInt \"1\")")))
{:tag :net.cgrand.sjacket.parser/root,
 :content
 [{:tag :list,
   :content
   ["("
    {:tag :symbol,
     :content
     [{:tag :ns, :content ["Integer"]}
      "/"
      {:tag :name, :content ["parseInt"]}],
     :id 14}
    {:tag :whitespace, :content [" "]}
    {:tag :string, :content ["\"" "1" "\""]}
    ")"]}]}
nil
user=>

The shorthand forms for Java interop are expanded at macro expansion time, so in order to track these symbols we should extract their symbol information before macro expansion.
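
The expansion described above can be observed with plain clojure.core, without symbol-analyzer. The sketch below only illustrates standard Clojure behavior for the constructor and instance-method shorthands: after one expansion step the operator symbols String. and .length are no longer present in the form.

```clojure
;; Constructor shorthand: the operator symbol String. disappears.
(macroexpand-1 '(String. "foo"))
;; => (new String "foo")

;; Instance-method shorthand: the operator symbol .length disappears.
(macroexpand-1 '(.length "foo"))
;; => (. "foo" length)
```

Any metadata attached to the original operator symbol has no corresponding symbol left to live on after this step, which is why extraction would need to look at these forms before expanding them.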

extraction from deftype form fails

imported from athos/genuine-highlighter#12

According to the error message, the deftype macro tries to import the defined class at macro expansion time and fails to find it, because extraction doesn't evaluate any forms at all.

At the moment, I have not come up with a workaround. At worst, we will have to restrict the usage of extraction.
