
ornias1993 / fetlife-aslsearch-reborn


This project forked from fabacab/fetlife-aslsearch


Tampermonkey user script offering an interface to perform pseudo-automatic searches of the FetLife.com user base filtered by age, sex, location, and role.

JavaScript 63.59% HTML 29.40% Python 2.91% TSQL 2.43% CSS 1.67%

fetlife-aslsearch-reborn's People

Contributors

fabacab, ornias1993, schroedingerhepcat

fetlife-aslsearch-reborn's Issues

Turn Tampermonkey script into dedicated extension for Firefox and Chrome

Description

Currently FetLife ASL Search Reborn uses Tampermonkey scripting. It might be worth investigating turning it into browser extensions.

Category

TODO

Detailed Bug Report

Moving FetLife ASL Search into an extension gives us some extra options.
One of these is spoofing the Referer header, which would solve #15. Another might be implementing some sort of verification of the plugin/extension, to prevent users from sending corrupt scraper data.

Column limitation in search options not working as intended

Description

Currently one can select the columns to display before a search, and afterwards using the DataTables ColVis plugin. Selecting columns beforehand leads to strange behavior in ColVis.

Category

TODO/BUG

Detailed Bug Report

Besides the above, it's redundant when ColVis works as it should, and it leads to needless code bloat. Keep It Simple, Stupid.

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Select some (but not all) limitations under the fourth submenu for selecting search options.
  2. Try flicking some ColVis switches.

Location in database is sorted wrong

The location is scraped incorrectly before it is sent to the extended search database.

The current (newish) format FetLife uses is:
/p/Country/Region/Locality

Not every region has a locality.
When no locality is set, the current scraper saves the country as the region.


Check Firefox for bugs

This hasn't been tested with Firefox yet. There are reports that it might not be showing up, so it needs testing.

Some pages are not getting scraped

Description

Some pages (for example users at a certain location) are not getting scraped

Category

BUG

Detailed Bug Report

It seems the classes and IDs are different: there appear to be two wholly different "FetLife" formats. The second format has no scraper code.

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:
1.
2.
3.

Design, test and document a workflow to migrate Google Docs databases

Description

With the transfer to SQL also comes the need to transfer the current database.
Although we currently have a 4.5 million entry database in SQL, the resolution (number of columns filled) is a lot higher, and it is a lot more up to date, with the freshly scraped data.

Category

TODO

Client not always downloading auto-update

Description

Clients that have manually copy-pasted the plugin do not auto-update it.

Category

BUG

Detailed Bug Report

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Copy paste plugin into a new tampermonkey script
  2. Wait for new version
  3. See no auto update

Use ColVis to hide less useful columns by default

Description

ColVis can be used to decide which columns to hide. That's useful.

Category

TODO

Detailed Bug Report

As of 0.5.6, default visibility is decided by the settings referenced in #39. As those are being removed, default visibility should be set using DataTables ColVis, to avoid showing less-used data.

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:
1.
2.
3.

Replace ColVis with Buttons

The ColVis plugin currently used for DataTables has been deprecated for ages. It needs to be replaced with its successor, "Buttons".

Location does not support FetLife location string

Currently the location search can only use one location (either one city, one region or one country), without commas. It would be nice if it supported the whole FetLife way of writing down a location, "City, Region, Country", and subsets of that.
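A minimal sketch of how such a location string could be split, assuming the parts are comma-separated and that a full three-part string is always ordered city, region, country. The function name `parseLocation` and the shape of the returned object are hypothetical and not taken from the project. With fewer than three parts the meaning is ambiguous, so the sketch returns them as unassigned terms rather than guessing.

```javascript
// Hypothetical helper: splits "City, Region, Country" (or a subset) into parts.
// Assumption: a full three-part string is ordered city, region, country.
function parseLocation(input) {
  const parts = String(input)
    .split(',')
    .map((part) => part.trim())
    .filter((part) => part.length > 0);

  // Nothing usable, or more parts than the format allows.
  if (parts.length === 0 || parts.length > 3) {
    return null;
  }

  if (parts.length === 3) {
    const [city, region, country] = parts;
    return { city, region, country, terms: [] };
  }

  // One or two parts could be any combination of city, region and country,
  // so they are left unassigned for the search to match against any column.
  return { city: null, region: null, country: null, terms: parts };
}
```

Returning `null` for unusable input lets the caller decide whether to reject the search or ignore the location filter.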

Print not working

Print was not giving a nice printer popup, and what did come out (pre 0.5.5) was ugly.

Column title in search results is wrong

Description

Column name for city and province/state is wrong.

Category

BUG

Detailed Bug Report

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Do a search
  2. look at column name
  3. Example: column name of "Locality" should be: "City"

Warning about paid members having high rapist chances

There is a warning in the code about paying members having high chances of being rapists. Personal discrimination based on which organisations you financially support is NOT okay and might be illegal in some jurisdictions.

FAADE has already been removed because it is illegal in A LOT of jurisdictions.

Slowdowns/Freezes on certain pages

On Chrome, FetLife seems to slow down or freeze on certain pages when running ASLSearch.
I am not fully sure ASLSearch is the culprit, but this has to be noted down.

Pages where it happens:

  • searching and clicking "places"
  • settings page/popup

Pages that are fine:

  • Profiles
  • Groups
  • Homepage

Google is going to block more mixed content

Description

It seems Google is going to block more mixed content.
This might create a need to move the frontend from a web server to the plugin. I can't be sure our current iframe workaround is going to keep working.

Category

TODO

Detailed Bug Report

No UX for GET failures when executing search

Description

Currently errors when searching are not reflected in the actual UI.

Category

TODO/Feature?

Detailed Bug Report

When a search fails, it would be nice if the "please wait" text were removed, the bar blinked red (instead of the green shown when results come in) and an error message were displayed.
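One way to make sure the "please wait" state always ends is to turn every outcome of the request, including a timeout, into an explicit UI state. The sketch below is an illustration under assumptions: `runSearch`, its parameters and the returned state object are hypothetical names, and `doRequest` stands for whatever function actually performs the GET.

```javascript
// Hypothetical wrapper: runs a search and always resolves to a UI state,
// so the "please wait" text can be cleared on success and on failure.
function runSearch(doRequest, timeoutMs) {
  let timer;
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(() => reject(new Error('Search timed out')), timeoutMs);
  });

  // Promise.resolve().then(...) also catches synchronous throws in doRequest.
  return Promise.race([Promise.resolve().then(doRequest), timeout])
    .then((results) => ({ status: 'ok', results, message: '' }))
    .catch((err) => ({
      status: 'error',
      results: [],
      message: String((err && err.message) || err),
    }))
    .finally(() => clearTimeout(timer));
}
```

The caller can then switch on `status`: hide the waiting text in both cases, and show `message` with a red bar when the status is `'error'`.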

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Make a search fail
  2. Notice endless "please wait..."

Make sure the node.js version and database get hosted.

Description

Currently everything is hosted on Google.
This needs to be moved to a Node.js/SQL server.

I'm currently working on getting a server to run it on.
Sadly, this takes time and money.

The ETA for 0.6.0 going live is between August and December.

Category

TODO

Detailed Bug Report

This is more of a "private-ish" note, and although I'm afraid no one can do much about it, I do take care to be transparent about it.

Done:

  • Get some space to put servers
  • Make sure it has Internet and Power
  • Get a 2U case
  • Getting UPS
  • Get a decent patchpanel (24p)

Currently working on:

  • Get some Ryzen server hardware, PSU, RAM, HDDs
  • Getting a more decent switch

Low priority:

  • Getting KVM

Pictures not showing

Currently the user pictures are not showing in the search results.
This is possibly due to some security changes made by FetLife.

Data storage is weird and should be fixed

Data storage for the server side is currently a mess of Google Sheets documents (no, this is not a joke :P ). This should be migrated to a more platform-friendly solution (and keep kink data out of Google).

ASL search overlay ugly

When opening ASL search, it disfigures the fetlife website (primarily it moves the top bar of fetlife down).

While it is working now, this should look WAY nicer.

Adding export buttons

The previous maintainer never got the export buttons to work. It would be nice to get those in.

More donation removal

I have already removed some of the most annoying donation spam.
But it needs more love.

The plugin should have just one request for donations in total (maybe with two buttons).

Nickname not showing

Description

The current live version doesn't seem to show the nickname.

Category

BUG

Detailed Bug Report

Possibly due to hiding of the avatar

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Do a Search
  2. Look at nickname

Remove unused loading bar code

The "filling" loading bar has been replaced with a "pre-filled" bar including the text "please wait" (see #27).

However, some of the code doing the filling remains and needs to be cleaned up.

Create separate Markdown for external sites

Description

Sleazy Fork and Greasy Fork can use an auto-updated Markdown file. But the current README features too much content; those sites should just get an "info" page, not the whole README.

Category

TODO

Detailed Bug Report

Also, it seems pictures don't go well with Sleazy Fork and Greasy Fork.

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:
1.
2.
3.

Loading bar does not represent actual search progress

Loading bar does not represent actual search progress and it confuses the crap out of people.

It should either show the actual progress or be a different loading bar (with a small bar going round in a big empty loading bar, like the old windows "wait" bar).

idea

Dear Ornias,

I have an idea.

Since this script didn't work for me, and I'm a developer myself, how about we make a separate searchable database of fetlife accounts?

Basically, the tampermonkey script would only parse the fetlife pages that the user opens and scrapes all the information that the user sees about other users - page URL, age, avatar URL, sex, kink, etc and then POST to a remote database.

Then, the actual search would be against that database (without crawling fetlife). It would be super-fast and almost accurate.

Fetlife will not be happy about this, but we could host this on some bulletproof hosting.
I'm a PHP/Mysql developer and can help with the backend.

Limit max connections on Node.js

Description

Currently it would be possible to flood the API with requests.
While it should be able to handle significant API requests, it should not be floodable.

Best would be two limits:

  1. On GET
  2. On PUT

Limiting searches to 2 or 3 per minute per user would be fine.
The POST API could get more load, but it also closes the connection way faster. I think 25 per 5 minutes would be prudent.

This would not protect against DDoS, however. DDoS protection could be added by limiting the connections to the SQL server and using decent networking solutions. That way the service would still go down, but with minimal damage to other services.

Suggested plugin for per-user rate limiting:
https://www.npmjs.com/package/express-rate-limit
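To illustrate the two limits, here is a dependency-free fixed-window limiter. It is a stand-in written for this note, not the express-rate-limit package and not its API. The names `createRateLimiter`, `allowSearch` and `allowSubmit` are hypothetical, and the key is assumed to be whatever identifies a user (for example an IP address).

```javascript
// Dependency-free fixed-window limiter (a stand-in, not express-rate-limit).
// Returns a function that reports whether a request for `key` is allowed.
function createRateLimiter(maxRequests, windowMs) {
  const windows = new Map(); // key -> { start, count }

  return function isAllowed(key, now = Date.now()) {
    const entry = windows.get(key);

    // First request for this key, or the previous window has expired.
    if (!entry || now - entry.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return true;
    }

    if (entry.count >= maxRequests) {
      return false; // over the limit for the current window
    }

    entry.count += 1;
    return true;
  };
}

// Limits taken from the numbers suggested above.
const allowSearch = createRateLimiter(3, 60 * 1000); // 3 searches per minute
const allowSubmit = createRateLimiter(25, 5 * 60 * 1000); // 25 submissions per 5 minutes
```

A request handler would call `allowSearch(key)` and answer with HTTP 429 when it returns `false`. Note that this sketch never evicts old keys, so a real deployment would need cleanup or a maintained library.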

Category

TODO

Input validation and sanitization rework

Description

Input validation and sanitization is a mess: at some spots where it is needed it isn't there, and in others it's only halfway done. This is a security issue.

Category

TODO

Detailed Bug Report

For PUSHed scrape results we should:

  • Prevent XSS attacks
  • Validate the format and throw everything out if it is wrong
  • Validate that an entry contains a valid user ID, or throw the whole entry out
  • Validate columns and throw unknown ones out (+ their value!)
  • Sanitise: make sure SQLi attacks can't be done and inputs are consistent

For GET requests we should:

  • Validate the format and throw everything out if it is wrong
  • Validate columns and throw unknown ones out (+ their value!)
  • Validate inputs to be what they should be (for example: url being an actual FetLife URL) and set them to null if not
  • Sanitise: make sure SQLi attacks can't be done and inputs are consistent
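The column check and the SQLi point can be combined: only whitelisted column names ever reach the SQL text, and values are passed separately as parameters. The sketch below rests on assumptions. The column names in `ALLOWED_COLUMNS` are placeholders rather than the project's schema, `buildWhereClause` is a hypothetical name, and the `?` placeholder style assumes a driver that supports positional parameters.

```javascript
// Assumption: these column names are illustrative, not the project's schema.
const ALLOWED_COLUMNS = new Set(['age', 'gender', 'role', 'country']);

// Builds a parameterised WHERE clause from a query object.
// Unknown columns are dropped together with their values.
function buildWhereClause(query) {
  const conditions = [];
  const values = [];

  for (const [column, value] of Object.entries(query)) {
    if (!ALLOWED_COLUMNS.has(column)) {
      continue; // unknown column: throw it out, value included
    }
    if (typeof value !== 'string' && typeof value !== 'number') {
      continue; // unexpected type (object, array, ...): throw it out
    }
    // Safe to interpolate: the column name comes from the whitelist.
    conditions.push(`${column} = ?`);
    values.push(value); // the value is only ever sent as a parameter
  }

  return { sql: conditions.join(' AND '), values };
}
```

Because user-supplied text never becomes part of the SQL string, an injected key or value cannot change the structure of the query.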

Some sources of user data not scraped

Description

Some sources of user data are not scraped. It would be nice if we could scrape more data with every pageview.

Category

Feature

Detailed Bug Report

For example: replies in a thread, posts on a wall, or friend avatars to the left of a profile.

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Turn debug on
  2. Open a profile
  3. See only one scraped entry

Users can craft too heavy search queries on large datasets

Description

On the 4.4 million row test dataset, it is still possible to create very slow queries. This should be prevented, as it puts load on the server and facilitates abuse.

Category

BUG

Detailed Bug Report

Most users would not be affected by such heavy queries. The UX expectation is that people would almost always enter a location, gender, age range or role. This in itself would ensure people at least hit an index.

On the user side, a timeout occurs after (at the time of writing) 20 seconds. However, this does not cancel the SQL query running on the server side.

While it will always be possible to DoS a system deliberately, users should be prevented from entering index-less search queries. We can do this by enforcing the use of at least gender and age.

We can also prevent abuse through stacking such queries by limiting the number of concurrent connections per IP. How this could be done in practice is still up for debate.
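Enforcing gender and age could be a small check that runs before any SQL is built. This is a sketch under assumptions: the filter names `gender` and `age` are taken from the text above, but the function name `missingRequiredFilters` and the query object shape are hypothetical.

```javascript
// Filters that every search must include so the query can use an index.
const REQUIRED_FILTERS = ['gender', 'age'];

// Returns the names of required filters that are absent or empty.
function missingRequiredFilters(query) {
  return REQUIRED_FILTERS.filter((name) => {
    const value = query[name];
    return value === undefined || value === null || String(value).trim() === '';
  });
}
```

If the returned list is not empty, the server can reject the request early (for example with HTTP 400) and name the missing filters, without touching the database.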

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Craft a search result that is not covered by indexes and has almost no results

CSS, JS and HTML should be separated

Currently some HTML or JS files also contain JS, CSS and/or HTML. This should be cleanly separated in the future. The plugin might be the exception when it comes to HTML in a JS file, but it should be structured more cleanly. Even in that case, CSS isn't needed in a JS file.

Write documentation for local installation of Server

Description

With the arrival of the Node.js based server also comes the possibility of local hosting and local development environments. A clear wiki article on how to set it all up should be written.

Category

TODO

Validate string inputs

Description

  • Validate inputs to be what they should be (for example: url being an actual FetLife URL) and set them to null if not

Category

TODO/feature

Detailed Bug Report

Currently the strings themselves are not actually checked. For example, "shit" could be sent as a country, and "dfrdefhydfh" as a URL. We should sanitize those. Country could use the known country list used by the search form, and maybe this could be merged with #26.
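A minimal sketch of both checks, setting invalid values to null as described. Several things here are assumptions: the function names are hypothetical, `KNOWN_COUNTRIES` is a two-entry placeholder for the search form's real country list, and requiring `https:` on a `fetlife.com` host is a guess at what "an actual FetLife URL" should mean.

```javascript
// Placeholder list: the real one would be the country list of the search form.
const KNOWN_COUNTRIES = new Set(['Netherlands', 'Germany']);

// Returns the normalised URL, or null if it is not an https FetLife URL.
// Assumption: only fetlife.com and its subdomains count as valid hosts.
function sanitizeProfileUrl(value) {
  try {
    const url = new URL(String(value));
    const hostOk =
      url.hostname === 'fetlife.com' || url.hostname.endsWith('.fetlife.com');
    return url.protocol === 'https:' && hostOk ? url.href : null;
  } catch (err) {
    return null; // not parseable as a URL at all
  }
}

// Returns the country if it is on the known list, otherwise null.
function sanitizeCountry(value) {
  return KNOWN_COUNTRIES.has(value) ? value : null;
}
```

Parsing with the built-in `URL` class and comparing the hostname avoids the common mistake of a substring check, which would accept hosts such as `evilfetlife.com`.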

Hosting platform choice is limited

The hosting platform for the server side is bound to Google Apps Script.
For now this is fine, but the server side should be rewritten to be more compatible with alternative hosting solutions. This would also make CI integration easier.

Project frontpage needs update

Description

Project frontpage is a tad ugly (aka: Not 2019 material)

Category

Feature

Detailed Bug Report

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Open project
  2. Look at frontpage
  3. Meh...

Hyperlink under nickname in search results broken

Description

When you click the hyperlink on a nickname in the search results, you get a Google error page instead of the FetLife profile.

Category

BUG

Detailed Bug Report

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:
1.
2.
3.

Not scraping group member list

Description

Group member lists are not getting scraped: scraping stops at the name of the first member.

Category

BUG

Detailed Bug Report

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Open member list of group
  2. Check database mutations

Avatars on homepage are not getting scraped

Description

Currently avatars on discussions and profiles (wall posts, friend lists etc.) get scraped. Avatars on personal homepages do not.

Category

BUG/TODO

Detailed Bug Report

The way those are set up is pretty close, except that those on the homepage are inside a table, and for some reason jQuery is throwing up when trying to process the same content inside a table.

It would be nice to scrape those, MOAR data.

Steps to Reproduce

Please enter the steps to reproduce the bug or behaviour:

  1. Turn on debug
  2. Test scraping a profile page
  3. Test scraping your personal homepage.
