exodus-core's Introduction

εxodus core

Contains:

  • Static analysis
  • Network analysis
  • Connection helper

Installation

exodus-core is available from PyPI:

pip install exodus-core

Include it in your project

Add the following line to your requirements.txt (replace 'XX' with the desired version):

exodus-core==XX

Local usage

Clone this repository:

git clone https://github.com/Exodus-Privacy/exodus-core.git
cd exodus-core

Using Docker

Build the Docker image:

docker build -t exodus-core .

Run tests:

docker run -it --rm exodus-core /bin/bash
python -m unittest discover -s exodus_core -p "test_*.py"

Manual installation

Install dexdump:

sudo apt-get install dexdump

Create Python virtualenv:

virtualenv venv -p python3
source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Run tests:

python -m unittest discover -s exodus_core -p "test_*.py"

exodus-core's People

Contributors

7homassutter, codeurimpulsif, counter-reverse, dependabot[bot], freebejan, gu1nness, jean-baptistec, jspricke, pnu-s, u039b

exodus-core's Issues

update requirements.txt and setup.py for python 3.9

The command pip install -r requirements.txt works in a virtualenv with Python 3.5, but fails with Python 3.9. Python 3.5 is also deprecated, as pip warns: DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.

The error logs are below.

gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DCYTHON_CLINE_IN_TRACEBACK=0 -I/usr/include/libxml2 -Isrc -Isrc/lxml/includes -I/usr/local/include/python3.9 -c src/lxml/etree.c -o build/temp.linux-x86_64-3.9/src/lxml/etree.o -w
  src/lxml/etree.c: In function ‘__Pyx_modinit_type_init_code’:
  src/lxml/etree.c:231155:32: error: ‘PyTypeObject {aka struct _typeobject}’ has no member named ‘tp_print’; did you mean ‘tp_dict’?
     __pyx_type_4lxml_5etree_Error.tp_print = 0;
                                  ^
  src/lxml/etree.c:231163:36: error: ‘PyTypeObject {aka struct _typeobject}’ has no member named ‘tp_print’; did you mean ‘tp_dict’?
     __pyx_type_4lxml_5etree_LxmlError.tp_print = 0;

For the moment, if you hit a similar issue, you can work around it by running virtualenv --python=/usr/bin/python3.5 venv-3-5 instead of virtualenv venv -p python3 in the build instructions of the exodus-core README.md.

Detect if dexdump is not available

Hello,

In v2.0.2, if dexdump is not installed, detect_trackers() simply returns an empty list.
I think it should raise an exception instead, because the current behavior can be confusing.
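A minimal sketch of the requested behavior, assuming the binary is looked up on PATH. The names find_dexdump and DexdumpNotFoundError are hypothetical and are not part of exodus-core; only shutil.which is a real standard-library call.

```python
import shutil


class DexdumpNotFoundError(RuntimeError):
    """Hypothetical exception raised when the dexdump binary is missing."""


def find_dexdump(binary_name="dexdump"):
    """Return the absolute path of dexdump, or raise if it is not on PATH."""
    path = shutil.which(binary_name)
    if path is None:
        # Fail loudly instead of silently reporting "no trackers found".
        raise DexdumpNotFoundError(
            "'{}' was not found on PATH; install it before running "
            "the analysis".format(binary_name)
        )
    return path
```

A caller could run this check once before the analysis starts, so that a missing binary is reported as an error rather than as an empty tracker list.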

Question: pip dependencies

Hi there,
I would like to use exodus in my project, but some of your dependencies are no longer up to date. Since pip 20.3 requires dependency conflicts to be resolved before anything can be installed, could you change the strict (==) version specifiers in your requirements.txt to minimum (>=) or compatible (~=) version specifiers?

exodus-core 1.3.1 requires cryptography==2.6.1, but you have cryptography 3.2.1 which is incompatible.
exodus-core 1.3.1 requires Pillow==6.2.2, but you have pillow 8.0.1 which is incompatible.
exodus-core 1.3.1 requires requests==2.21.0, but you have requests 2.25.0 which is incompatible.

Of course, changing this would mean more testing on your side. However, it would also help you stay up to date with the latest changes.

Thx

Tom

Update GitHub actions to only push new version to pypi once

Currently, when a new tag is pushed, GitHub Actions tries to push the new version to PyPI twice, once for each Python version currently supported.

There is no real impact, except that the pipeline fails.

We should make sure to only push it with one version of python.

TypeError: object of type 'generator' has no len()

From exodus-standalone, but it belongs here:

I installed exodus-standalone in a venv and got:

  File "exodus_analyze.py", line 59, in
    analysis.print_apk_infos()
  File "exodus-standalone/venv/lib/python3.7/site-packages/exodus_core/analysis/static_analysis.py", line 483, in print_apk_infos
    print('- App libraries: %s' % len(libraries))
TypeError: object of type 'generator' has no len()


Commenting out the three lines makes it run again, but then of course no libraries are shown ;-)
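The error itself can be reproduced in isolation. The sketch below assumes the libraries value is produced by a generator; iter_libraries and the sample file names are hypothetical stand-ins, not the exodus-core implementation. Materializing the generator into a list is one possible fix.

```python
def iter_libraries(names):
    """Hypothetical stand-in for a method that yields library names lazily."""
    for name in names:
        yield name


libraries = iter_libraries(["libfoo.so", "libbar.so"])

# len(libraries) at this point would raise:
#   TypeError: object of type 'generator' has no len()
# Turning the generator into a list makes it countable and reusable.
libraries = list(libraries)
print('- App libraries: %s' % len(libraries))
```

The trade-off is that a list holds every item in memory at once, which is usually fine for a list of library names.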

Use GitHub Actions to create release with tar file

We used to have Travis CI to push our generated tar archive to 2 places:

  • pypi
  • github releases

We now have github actions to push to pypi (the most important) but it does not do the same for github releases.

specify what a tracker does

Currently, exodus privacy only looks for tracker signatures and does not take the context into account. As stated on the site https://exodus-privacy.eu.org/en/page/faq/#negatives:

Our static detection method looks in applications for the presence of a defined list of trackers. If the signature of a tracker is detected in the analysis, its presence is indicated in the report. This is not a proof of activity of these trackers.

So I suggest looking for the signatures of method calls instead of looking for imports (at the moment, the application looks for anything that contains a tracker signature, but this work is in progress in #35). This would avoid false positives on applications that import tracker libraries but do not use them. Of course, nothing prevents us from telling the user that an unused tracker is present, or from making this a setting.

In the far future, we may want to use the events of the Dalvik VM to locate exactly when a tracker is called. Example: when a call to the method onResume() is found, we tell the user that each time the application is resumed, tracker X takes information Y.

It sounds hard but not impossible. I am becoming more and more familiar with androguard. Here are some links to the docs that may help:

https://androguard.readthedocs.io/en/latest/api/androguard.core.bytecodes.html#androguard.core.bytecodes.dvm.DalvikVMFormat.get_methods_class

to find any function declarations

https://androguard.readthedocs.io/en/latest/api/androguard.core.bytecodes.html#androguard.core.bytecodes.dvm.EncodedMethod.get_instructions

to get the instructions of this function

https://androguard.readthedocs.io/en/latest/api/androguard.core.bytecodes.html#androguard.core.bytecodes.dvm.Instruction.show_buff

to list any tracker. We would have to parse this output.

Good luck.

Some Firebase components are falsely reported as Firebase Analytics

Hi there, long time no see! 😄

I've stumbled across an interesting case I wanted to share with you, when analyzing an Android app 🤔

Context

  • the app uses Firebase Messaging, to deliver mobile notifications across several platforms (iOS + Android)
  • the app makes no use at all of Firebase Analytics, nor any related SDK
  • Firebase Messaging can be coupled to Firebase Analytics - it's totally optional, no hard dependency here
  • Firebase Messaging uses another small package to check whether the Analytics SDK is present: it's called Firebase Analytics Connector, and in my case it only returns null

Result

  • exodus-core looks for (among others) com.google.firebase.analytics.
  • the connector package, named com.google.firebase.analytics.connector, triggers the rule as it contains the detection pattern
  • the app is flagged as using Firebase Analytics, while the "real" SDK is not embedded at all in the app (I extensively checked), and no tracking nor measurement is done 😅

Do you think something can be done about this? 🙏
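The false positive described above can be illustrated with a small sketch. It assumes the detection rule is a regular expression built from the prefix quoted in the report; the stricter rule with a negative lookahead is only one hypothetical way to exclude the connector package, and the class names are illustrative samples.

```python
import re

# Assumed detection pattern, taken from the prefix quoted in the report.
BROAD_RULE = re.compile(r"com\.google\.firebase\.analytics\.")

# Hypothetical stricter rule: same prefix, but skip the connector sub-package.
STRICT_RULE = re.compile(r"com\.google\.firebase\.analytics\.(?!connector\b)")

# Illustrative class names only.
classes = [
    "com.google.firebase.analytics.connector.AnalyticsConnector",
    "com.google.firebase.analytics.FirebaseAnalytics",
    "com.google.firebase.messaging.FirebaseMessaging",
]

broad_hits = [c for c in classes if BROAD_RULE.search(c)]
strict_hits = [c for c in classes if STRICT_RULE.search(c)]

print(broad_hits)   # includes the connector class
print(strict_hits)  # only the class from the analytics package itself
```

Whether an exclusion like this is acceptable depends on how the tracker signatures are maintained, since it would have to be kept in sync with future Firebase package names.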

List of changed reports

When we add new trackers to the database, exodus will rerun all the reports.
It would be very useful to have a list of the reports that changed.

Update requirements

I made a new installation with the following dependencies:

requests==2.20.0  # previously 2.18.4
Pillow==5.0.0
dhash==1.3
gplaycli==3.21
protobuf==3.5.2.post1
jellyfish==0.5.6
cryptography==2.3.0  # previously 2.2.2
beautifulsoup4==4.6.0
androguard==3.1.0

pip freeze gives the following:

androguard==3.1.0
args==0.1.0
asn1crypto==0.24.0
backcall==0.1.0
beautifulsoup4==4.6.0
certifi==2019.3.9
cffi==1.12.2
chardet==3.0.4
click==6.7
clint==0.5.1
colorama==0.4.1
cryptography==2.3
decorator==4.4.0
dhash==1.3
future==0.17.1
gpapi==0.4.2
GPlayCli==3.21
idna==2.7
ipython==7.4.0
ipython-genutils==0.2.0
jedi==0.13.3
jellyfish==0.5.6
lxml==4.3.3
networkx==2.2
parso==0.4.0
pexpect==4.7.0
pickleshare==0.7.5
Pillow==5.0.0
prompt-toolkit==2.0.9
protobuf==3.5.2.post1
ptyprocess==0.6.0
pyasn1==0.4.5
pyaxmlparser==0.3.15
pycparser==2.19
pycryptodome==3.8.1
Pygments==2.3.1
requests==2.20.0
six==1.12.0
traitlets==4.3.2
urllib3==1.24.1
wcwidth==0.1.7

Note that, compared to the current version, the changes are the following:

  • click was added (new dependency)
  • joblib was removed
  • simplegeneric was removed

replace %s by {}

%s is C-style formatting. It was common in Python 2 and should no longer be used in Python 3.

I have already opened a pull request: #12. I had not opened an issue for it yet.
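For comparison, here is the same message written with printf-style formatting, with str.format, and with an f-string. The message text mirrors a print call quoted elsewhere in these issues; library_count is a hypothetical value.

```python
library_count = 3  # hypothetical value for illustration

old_style = '- App libraries: %s' % library_count        # printf-style
new_style = '- App libraries: {}'.format(library_count)  # str.format
f_string = f'- App libraries: {library_count}'           # f-string (Python 3.6+)

print(new_style)
```

All three produce the same string, so the change is a matter of style rather than behavior.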

Increase Specificity of dexdump Regex

I'm not sure if it will make a practical difference to which trackers are detected, but I noticed that the regex here seems unnecessarily broad, and will match on things other than class names:

cmd = '%s %s/classes*.dex | perl -n -e\'/[A-Z]+((?:\w+\/)+\w+)/ && print "$1\n"\'|sort|uniq' % (
dexdump, tmp_dir)

Here is a short example of what I mean. Here are the first 20 lines from running dexdump on WhatsApp:

dexdump work_dir/tmp_dir/classes.dex | head -n 20
Processing 'work_dir/tmp_dir/classes.dex'...
Opened 'work_dir/tmp_dir/classes.dex', DEX version '035'
Class #0            -
  Class descriptor  : 'La/a/a/a/a/a$a;'
  Access flags      : 0x0011 (PUBLIC FINAL)
  Superclass        : 'Ljava/lang/Object;'
  Interfaces        -
  Static fields     -
  Instance fields   -
    #0              : (in La/a/a/a/a/a$a;)
      name          : 'a'
      type          : 'Ljava/lang/String;'
      access        : 0x0001 (PUBLIC)
    #1              : (in La/a/a/a/a/a$a;)
      name          : 'b'
      type          : 'Ljava/lang/String;'
      access        : 0x0001 (PUBLIC)
    #2              : (in La/a/a/a/a/a$a;)
      name          : 'c'
      type          : 'Ljava/lang/String;'

And here is the result of running the regex which the code is currently using. Note the matching on the superclass and on instance field types.

dexdump work_dir/tmp_dir/classes.dex | head -n 20 | perl -n -e'/[A-Z]+((?:\w+\/)+\w+)/ && print "$1\n"' | sort | uniq
a/a/a/a/a/a
java/lang/Object
java/lang/String

I've written a slightly different regex which is more specific, so it will only match on the Class descriptor:

dexdump work_dir/tmp_dir/classes.dex | head -n 20 | perl -n -e'/\s*Class descriptor\s*:\s*\047L((?:\w+\/)+\w+)\W/ && print "$1\n"'  | sort | uniq
a/a/a/a/a/a

I would be curious to see whether this change has any downstream effects (i.e., on what trackers are detected). If this change looks good, I can include it in a PR, potentially addressing #7 at the same time.

EDIT: Fixed my regex. It didn't account for some class descriptors not including a $ or starting with a variable number of whitespace characters.
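As a rough Python counterpart of the stricter Perl pattern above, the sketch below matches only "Class descriptor" lines and does the sort and uniq steps in Python. The function name extract_class_names is hypothetical, the sample is a shortened copy of the dexdump output quoted in this issue, and the pattern is anchored at the start of each line, which differs slightly from Perl's unanchored search.

```python
import re

# Match only "Class descriptor" lines and capture the slash-separated name.
CLASS_DESCRIPTOR = re.compile(r"\s*Class descriptor\s*:\s*'L((?:\w+/)+\w+)\W")

SAMPLE = """\
Class #0            -
  Class descriptor  : 'La/a/a/a/a/a$a;'
  Access flags      : 0x0011 (PUBLIC FINAL)
  Superclass        : 'Ljava/lang/Object;'
      type          : 'Ljava/lang/String;'
"""


def extract_class_names(dexdump_output):
    """Return the sorted, de-duplicated class names found in dexdump output."""
    names = set()
    for line in dexdump_output.splitlines():
        match = CLASS_DESCRIPTOR.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


print(extract_class_names(SAMPLE))
```

On this sample, the superclass and field type lines are ignored, which matches the result of the Perl one-liner shown above.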

Allow gplaycli usage without token dispenser

The download of application details with gplaycli can only use a token dispenser, which is currently malfunctioning.

We need to somehow make it work with a fixed gmail account (read from gplaycli config file).

Tests fail because https://matlink.fr cannot be accessed

Hello,

I tried running tests, but one of them fails as https://matlink.fr/token/email/gsfid which is hardcoded at line 269 of exodus_core/analysis/static_analysis.py cannot be accessed.

Is there another working URL that could be used instead ?
Please note it's also the one used in default gplaycli.conf, meaning the project won't work out of the box.

I tried setting (by using this example: https://github.com/matlink/gplaycli/blob/master/example_credentials.conf):

        gpc.token_enable = False
        gpc.gmail_address = 'gmail_address'
        gpc.gmail_password = 'password'

But then I get ERROR:root:'GPlaycli' object has no attribute 'token_url'

Here are the full test logs with the unmodified exodus-core source code:

python3 -m unittest discover -s exodus_core -p "test_*.py"
invalid decoded string length
invalid decoded string length
invalid decoded string length
..WARNING:root:Unable to get the icon from the APK - downloading from details
ERROR:gplaycli.gplaycli:cache file does not exists or is corrupted
ERROR:root:HTTPSConnectionPool(host='matlink.fr', port=443): Max retries exceeded with url: /token/email/gsfid (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113606630>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
WARNING:root:Unable to get the icon from details - downloading from GPlay
ERROR:root:Unable to download the icon from Google Play
ERROR:root:Unable to download the icon
ERROR:root:Unable to save the icon
F...
======================================================================
FAIL: test_icon_diff (analysis.test_exodus_analyze.TestExodus)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/FrancoisDupayrat/Documents/exodus-core/exodus_core/analysis/test_exodus_analyze.py", line 52, in test_icon_diff
    self.assertEqual(phash_4, 325352301465779383961442563121869825536)
AssertionError: '' != 325352301465779383961442563121869825536

----------------------------------------------------------------------
Ran 6 tests in 71.067s

FAILED (failures=1)

Command injection with malicious apk

Hello,

There is a command injection in the get_embedded_classes method. A malicious APK containing a file named classes.;id;.dex will result in the execution of the id binary in the context of the exodus-core process.

I would recommend using glob instead, and moving all the post-processing of the dexdump output to Python, instead of using Perl, uniq and sort.

I might submit a patch to fix this later this week-end ♥
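A minimal sketch of the glob-based approach suggested above, assuming dexdump accepts the dex files as plain arguments (as in the shell command quoted in another issue). The function names are hypothetical and this is not the actual patch. Because the command is passed as a list and no shell is involved, a ';' in a file name is never interpreted.

```python
import glob
import os
import subprocess


def build_dexdump_command(dexdump, tmp_dir):
    """Build an argument list for dexdump without going through a shell."""
    # glob.escape keeps special characters in tmp_dir from acting as wildcards.
    pattern = os.path.join(glob.escape(tmp_dir), "classes*.dex")
    dex_files = sorted(glob.glob(pattern))
    # Each file name is a single argv entry, so ';' has no special meaning.
    return [dexdump] + dex_files


def run_dexdump(dexdump, tmp_dir):
    """Run dexdump on every classes*.dex file and return its stdout as text."""
    command = build_dexdump_command(dexdump, tmp_dir)
    if len(command) == 1:
        return ""  # no dex files found
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout
```

The output returned by run_dexdump can then be filtered, sorted and de-duplicated in Python rather than with Perl, sort and uniq.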

Unable to parse x509 certificates which violate RFC

Issue found in report 7-14-2135, ...

Traceback (most recent call last):
  File "manage.py", line 22, in <module>
    execute_from_command_line(sys.argv)
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/home/exodus/exodus/exodus/reports/management/commands/refreshapksignature.py", line 58, in handle
    report.application.app_uid = static_analysis.get_application_universal_id()
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/exodus_core/analysis/static_analysis.py", line 328, in get_application_universal_id
    for c in self.get_certificates():
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/exodus_core/analysis/static_analysis.py", line 337, in get_certificates
    cert = Certificate(c)
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/exodus_core/analysis/static_analysis.py", line 44, in __init__
    self.issuer = Certificate.get_Name(cert.issuer, short = False)
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/cryptography/hazmat/backends/openssl/x509.py", line 102, in issuer
    return _decode_x509_name(self._backend, issuer)
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/cryptography/hazmat/backends/openssl/decode_asn1.py", line 66, in _decode_x509_name
    attribute = _decode_x509_name_entry(backend, entry)
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/cryptography/hazmat/backends/openssl/decode_asn1.py", line 57, in _decode_x509_name_entry
    return x509.NameAttribute(x509.ObjectIdentifier(oid), value, type)
  File "/home/exodus/exodus/venv/lib/python3.5/site-packages/cryptography/x509/name.py", line 50, in __init__
    "Country name must be a 2 character country code"
ValueError: Country name must be a 2 character country code

Refers to pyca/cryptography#3857

Add module to Pypi

There is currently a version of exodus-core on PyPI, but it is not usable: https://pypi.org/project/exodus-core/
We should publish the module on PyPI with its different versions and its documentation.

It would allow:

  • an easier import for developers who want to integrate exodus-core (including us :))
  • an easier way for us to see who uses exodus-core

currently use dexdump to analyze apk files

For the moment, the core of exodus embeds dexdump to perform analysis.

This choice makes exodus heavier in terms of disk space, less flexible, and harder to maintain.

So I suggest replacing dexdump with code that performs the analysis. Which library to use is an open question; please give your opinion in this issue.

Once we find a good library, I will be ready to help integrate it into exodus privacy.

Good luck choosing the library.

Discussion section for public "Technical only" Discussions

Hi, today I found your project and I would be happy to contribute to it. Since this core is copyleft and licensed under AGPLv3+, I am more than excited. We only need a communication channel: a separate "Discussions" section on GitHub, which would be easy to follow.

Thanks and Happy Privacy!
