law-net

What can we learn by applying network and text analysis to the law? This project contains code to analyze legal text and citation networks using data generously provided by CourtListener and the Supreme Court Database.

Some interesting networks include

  • Supreme Court citation network (27,885 nodes, 234,312 directed edges)
  • Federal Appellate circuit (959,985 nodes, 6,649,916 directed edges)
  • any one of the over 400 jurisdiction subnetworks listed on CourtListener

These all have accompanying opinion text files as well as additional node metadata such as the case date and hand-coded issue area (for SCOTUS).

We recently gave a presentation about our exploratory analysis at the PyData conference.

PyData Carolinas

Our code

You can load the SCOTUS subnetwork (saved in this directory as a .graphml file)

import igraph as ig
G = ig.Graph.Read_GraphML('scotus_network.graphml')
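Once a network is loaded, a natural first quantity to look at is how often each case is cited (its in-degree). The following is a minimal, library-independent sketch of that idea on a made-up edge list; the case ids and the `in_degree` helper are hypothetical, and it assumes edges point from the citing case to the cited case. On the loaded graph, igraph's `G.indegree()` reports the same quantity per node.

```python
from collections import Counter

# Hypothetical toy citation edges as (citing_case, cited_case) pairs.
# These ids are made up; real nodes are identified by CourtListener opinion ids.
edges = [
    ("C", "A"),
    ("C", "B"),
    ("D", "A"),
    ("D", "C"),
    ("E", "A"),
]


def in_degree(edge_list):
    """Count how many times each case is cited (assumes citing -> cited edges)."""
    return Counter(cited for _citing, cited in edge_list)


citation_counts = in_degree(edges)
most_cited = citation_counts.most_common(1)[0]
print(most_cited)
```

Cases that are never cited simply do not appear in the counter, so looking them up returns 0.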

User beware: we have not yet made the code clean/robust/user friendly/pleasant/etc. We will get to this soon. If you have trouble with something, please reach out to Iain ([email protected]).

To download much more data see download_data.ipynb. This notebook allows you to work with other jurisdiction subnetworks and the opinion text files. Note the two directories you have to change at the top of the notebook.

One of the functions in download_data.ipynb will set up a data directory. I suggest putting data_dir outside your copy of the GitHub repo and outside Dropbox. GitHub doesn't like large data files, and Dropbox might slow things down if you do a lot of reading and writing (e.g. for some NLP operations).
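As a rough illustration of the advice above, here is a small sketch that refuses to create a data directory inside the repository. It is not the function from download_data.ipynb: the names `is_inside` and `make_data_dir`, and the `raw`/`clean` subfolders, are hypothetical and only serve the example.

```python
import os
import tempfile


def is_inside(path, parent):
    """Return True if `path` is `parent` itself or lies underneath it."""
    path = os.path.realpath(path)
    parent = os.path.realpath(parent)
    # Append a separator so "law-net-data" is not mistaken for a child of "law-net".
    return path == parent or path.startswith(parent.rstrip(os.sep) + os.sep)


def make_data_dir(data_dir, repo_dir, subdirs=("raw", "clean")):
    """Create data_dir and its (hypothetical) subfolders outside the repo."""
    if is_inside(data_dir, repo_dir):
        raise ValueError("data_dir should live outside the repository")
    for name in subdirs:
        full = os.path.join(data_dir, name)
        if not os.path.isdir(full):
            os.makedirs(full)
    return data_dir


# Demo in a throwaway temporary folder so nothing real is touched.
base = tempfile.mkdtemp()
repo_dir = os.path.join(base, "law-net")
os.makedirs(repo_dir)
data_dir = make_data_dir(os.path.join(base, "law-net-data"), repo_dir)
print(sorted(os.listdir(data_dir)))
```

The same check could be repeated against a Dropbox folder by passing that folder as `repo_dir`.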

About the data

Currently we are using data from CourtListener (CL) and the Supreme Court Database (SCDB):

  • the citation network comes from CL

  • opinion texts come from CL

  • some case metadata (jurisdiction, date, judges) comes from CL

  • additional case metadata comes from SCDB

    • for issueArea we have coded Missing as 0. Only SCOTUS cases can have issueArea.
  • we identify cases by their CourtListener opinion id

    • CL opinion ids and cluster ids are not necessarily the same. One cluster can have many opinions.
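The two conventions above (issueArea 0 means missing, and one cluster can hold several opinions) can be sketched as follows. The records, ids, and the field names `opinion_id` and `cluster_id` are made up for illustration and are not taken from the project's data files.

```python
from collections import defaultdict

# Hypothetical records; ids and field names are invented for this example.
opinions = [
    {"opinion_id": 101, "cluster_id": 9001, "issueArea": 2},
    {"opinion_id": 102, "cluster_id": 9001, "issueArea": 2},
    {"opinion_id": 103, "cluster_id": 9002, "issueArea": 0},
]


def opinions_by_cluster(records):
    """Group opinion ids under their cluster id (one cluster, many opinions)."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec["cluster_id"]].append(rec["opinion_id"])
    return dict(clusters)


def issue_area_or_none(record):
    """Translate the 0 'Missing' code into None so it is not read as a real area."""
    area = record.get("issueArea", 0)
    return None if area == 0 else area


print(opinions_by_cluster(opinions))
print([issue_area_or_none(rec) for rec in opinions])
```

Converting 0 to None early helps avoid accidentally treating missing values as a real issue area when counting or averaging.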

Code dependencies

The code is written in Python 2.7. You need

  • Anaconda

  • igraph

  • nltk

    • after installing nltk, run the following commands in Python

    import nltk
    nltk.download()

Our group

If you are interested in collaborating feel free to reach out to us! This is a collaboration between

Anna Zhao

Bill Shi

Brendan Schneiderman

Ethan Koch

Iain Carmichael

James Jushchuk

James Wudel

Michael Kim

Shankar Bhamidi
