anidata / ht-archive
Anidata 1.0: Frontend node service for data collected from palantiri
License: BSD 2-Clause "Simplified" License
User story: As a prosecuting attorney, I should be able to see what data/webpages are related to the entity of interest, so I can explore potential pieces of evidence to help build my case file.
Tasks:
Database host URLs are sometimes pretty long and complicated. I think it would make life easier if we stored DB details and other config in a config file in the project instead of requiring the user to pass those details every time they want to run app.js.
Create a REST API for the database. The endpoints could essentially emulate what the Data Access Objects do
Relevant files:
routes/index.rs
routes/daos
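As a rough sketch of endpoints that mirror the Data Access Objects, the example below maps one URL onto one DAO call using only Node's built-in `http` module. The `entityDao` object, its `findById` method, the `/api/entities/:id` path, and the sample row are all hypothetical stand-ins; the project's real DAOs would query Postgres.

```javascript
const http = require('http');

// Hypothetical DAO backed by an in-memory object instead of a database.
const entityDao = {
  async findById(id) {
    const rows = { 1: { id: 1, phone: '123-456-7890' } };
    return rows[id] || null;
  },
};

// Maps "GET /api/entities/:id" onto the matching DAO call.
async function route(dao, method, url) {
  const match = /^\/api\/entities\/(\d+)$/.exec(url);
  if (method !== 'GET' || !match) {
    return { status: 404, body: { error: 'not found' } };
  }
  const entity = await dao.findById(Number(match[1]));
  return entity
    ? { status: 200, body: entity }
    : { status: 404, body: { error: 'no such entity' } };
}

// Wraps the router in an HTTP server that always answers with JSON.
function createServer(dao) {
  return http.createServer(async (req, res) => {
    let result;
    try {
      result = await route(dao, req.method, req.url);
    } catch (err) {
      result = { status: 500, body: { error: 'internal error' } };
    }
    res.writeHead(result.status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(result.body));
  });
}

// Usage: createServer(entityDao).listen(3000);
```

Keeping `route` separate from the server makes each endpoint testable without opening a socket, and a frontend framework could later call the same URLs.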
Entity search result data seems to disagree with results from manually querying the SQL database.
Tasks:
Currently the "freestyle search" result links point to the wrong BackPage post. For example, the search result list shows a post with the title "This is a ad", but clicking the link for that result displays a post with the title "This is a different ad."
Tasks:
Copy the useful bits of setup information on the wiki page to README.md so setup instructions have more visibility.
We need to update documentation and Markdown files (files ending in .md) and remove old information (e.g. references to GitLab instead of GitHub, old labeling systems, etc.).
Tasks:
README.md
CONTRIBUTING.md
LICENSE to BSD
Remove the swig code in the current views directory and instead use Angular, React, or some other cool JS framework to call the API.
Dependencies:
Create a smaller database for prototyping, one that isn't nearly a quarter million rows.
Just a quick note about a lack of clarity in the docs: If a user installs and starts a postgres server themselves instead of using the Docker command that creates a user named dbadmin, then running the queries in the Wiki to create and populate the crawler DB as user dbadmin will result in a "role does not exist" error. If they try to execute the commands as user postgres, loading data from the SQL dump into the database throws errors, since crawler.sql specifies that the commands should be executed by a user named dbadmin. We should specify in the Wiki that users who don't use Docker will have to create a superuser named dbadmin for the commands in the wiki to work.
On a related note, to create the database from the command line you have to run the commands as user postgres, like so: $ sudo -u postgres psql -c "CREATE DATABASE foo;". Trying to run the commands as they appear in the wiki, psql --username postgres -c "CREATE DATABASE foo", fails because of a peer authentication error. See this SO answer for an explanation of why.
As a user, I should be allowed to enter a phone number in the entity resolution search as space separated (123 456 789), hyphen separated (123-456-789), or without any delimiter at all (123456789). Currently I can't do that, because the phone numbers are stored in the DB with hyphens as separators.
Instead of strictly enforcing the hyphen format, maybe we should strip out the hyphens from the existing phone numbers in the DB. The app can then strip out any non-numeric characters from the user's query on the backend, and the SQL query will still return matches. This will allow for more robust searching that doesn't break if a user doesn't use the xxx-xxx-xxxx phone number format.
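The normalization proposed above could be sketched roughly as follows. The `entities` table and `phone` column are hypothetical names, and the `{ text, values }` query shape assumes a node-postgres style client with `$1` placeholders; the actual schema and DAO code may differ.

```javascript
// Strips everything except digits, so "123 456 789", "123-456-789"
// and "123456789" all normalize to the same string.
function normalizePhone(input) {
  return String(input).replace(/\D/g, '');
}

// One-time cleanup of the stored values (hypothetical table and column).
// regexp_replace with the 'g' flag removes every non-digit character.
const MIGRATION_SQL =
  "UPDATE entities SET phone = regexp_replace(phone, '[^0-9]', '', 'g')";

// Builds a parameterized query so user input is never concatenated into SQL.
function buildPhoneQuery(userInput) {
  return {
    text: 'SELECT * FROM entities WHERE phone = $1',
    values: [normalizePhone(userInput)],
  };
}
```

Because both the stored numbers and the search term pass through the same digit-only form, the match no longer depends on which separator the user typed.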