quarkslab / irma-frontend Goto Github PK

IRMA frontend

irma-frontend's Introduction

28 Nov 2016 - v1.5.3 - IRMA is now one single repository check <https://github.com/quarkslab/irma>

irma-frontend's People

commial ufausther deloittem

irma-frontend's Issues

Progress on scan with no probes

Description:
When no probes are detected, one can still upload files, which then appear to be processed. Moreover, no error is displayed.

Occurrence:
/

When:
Upload a file to scan with no probes

Log Trace:
None

Scan event timestamps are truncated

Probe list timeout is displayed as api_error

API precisions

These points concern the web API documentation (and may be considered cosmetic):

the Launch a scan API specifies an options argument without listing the available options. For instance, the web interface uses the option force: True to force a re-scan. It would be great to have them directly in the documentation;
there is a typo in Cancel a scan: you'll recieved an Unexpected error.;
the documentation specifies that You may not cancel a scan that has not been launched, you'll recieved an Unexpected error.. Actually, creating a new scan with POST /scans and canceling it with POST /scans/{id}/cancel returns:

{
u'status': 110,
u'probes_finished': 0,
u'results': [],
u'probes_total': 0,
u'date': 1438118972,
u'id': u'558c4830-efb0-4453-8599-0448863e3a09'
}

If I understand correctly, it means the scan has been canceled (without being launched).

Error when returning results

The probe scanned the file, then the frontend gets this?

{"msg": "User: Multiple results found instead of one", "code": -1}

Symantec Parsing issue

Symantec parsing is strange; it shows the following:

EICAR Test String - c:\users\xxxx\appdata\local\temp\tmpxlrdhm

This appears in 1.0.1

Probes remain in "waiting"

Description:
Probes remain in "Probe still working... ", even though the progress bar shows the scan as completed and the response with results has also been sent.

Occurrence:
/

When:
Always

Log Trace:
None

How to reproduce:

Take your working copy of irma-frontend, irma-brain, irma-probes
checkout testing branch
restart everything: celeryd.brain, celeryd.results, celeryd.probe, celeryd, uwsgi, nginx
scan a file

Scan event timestamps are truncated

New feature: ICAP Server

It would be nice to run an ICAP server to plug Squid into IRMA, for example (in effect replacing clamav in SquidClamav with IRMA).

We would like to set up a file rotation system for the frontend's /var/irma/samples. I have tried to remove some samples, but if I scan them again later (I force the scan to be performed with the REST API), celery and uwsgi raise an ftp upload error caused by an IOError (file not found). I guess that IRMA does not handle cases where samples have been deleted and should be put again in /var/irma/samples to be rescanned.

Would it be possible to fix this issue? If not, is there any recommended way to safely rotate files in /var/irma/samples to avoid disk saturation?

Thanks!

Florent

EDIT: The bug appears in IRMA 1.2.0, but the diffs with 1.3.0 do not seem to fix it, is it reproducible on your setup?
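The failure described above (an IOError deep inside the FTP upload) could be turned into an explicit, recoverable error. The following is a minimal sketch, assuming a hypothetical helper open_sample and a hypothetical exception SampleMissingError; neither name comes from the IRMA code base.

```python
import os


class SampleMissingError(Exception):
    """Raised when a stored sample is no longer on disk (hypothetical name)."""


def open_sample(path):
    """Open a stored sample, failing with an explicit error if it was rotated away."""
    if not os.path.isfile(path):
        # An explicit error lets the caller ask the user to upload the file again.
        raise SampleMissingError("sample not found on disk: %s" % path)
    return open(path, "rb")
```

A caller could catch SampleMissingError and request a fresh upload instead of letting the scan fail later in the pipeline.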

Add link to the full report while scanning

While scanning, it would be nice to have the full URL of the report instead of only the scan ID (or both), which would be more useful to the end user ;)

Upload files with utf-8 encoded name

Hi,

It would be nice to be able to upload files with a name encoded in utf-8 rather than ascii (for example containing accents). One possible way to do that would be to use escape(urldecode(upfile.raw_filename)) (at scans.py:146) rather than using the Bottle escaping as is (which is what upfile.filename is, according to the Bottle doc).

I think it would be very valuable to users to be able to keep the file name they uploaded (and globally). What do you think?
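The decode-then-escape idea suggested above can be sketched with the Python 3 standard library. This is an illustration only: display_name is a hypothetical helper, and it uses urllib.parse.unquote and html.escape rather than the Bottle functions mentioned in the issue.

```python
from html import escape
from urllib.parse import unquote


def display_name(raw_filename):
    """Decode a percent-encoded UTF-8 file name, then HTML-escape it for display."""
    decoded = unquote(raw_filename, encoding="utf-8", errors="replace")
    # Drop any directory part a client may have sent along with the name.
    base = decoded.replace("\\", "/").rsplit("/", 1)[-1]
    return escape(base)
```

Accented characters survive the round trip, while markup characters and path components are neutralised before the name is shown or stored.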

Handle big files

Hi,

I noticed in at least two places that files are entirely read in RAM before being written to disk. This could cause issues if big files have to be scanned (an ISO for example, or a big document archive).

Two examples:

scans.py:147: there should be a way to use the file object rather than reading the whole file.
irma-common/irma.ftp.handler:178: the lambda function could update a hash and write the data to disk rather than storing the file chunks in RAM. That way the whole file would never be held in RAM (the current implementation copies it into RAM at least twice).

Do you think it would be possible to fix these issues? (There must be other places where this is the case, these are just two examples I found.)

Cheers,

Florent
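The chunked approach suggested in this issue (update a hash and write to disk as the data arrives) can be sketched as follows. The function name save_stream and the 64 KiB chunk size are assumptions made for the example, not taken from the IRMA sources.

```python
import hashlib


def save_stream(src, dest_path, chunk_size=64 * 1024):
    """Copy a file-like object to disk chunk by chunk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(dest_path, "wb") as dest:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: src.read(chunk_size), b""):
            digest.update(chunk)
            dest.write(chunk)
    return digest.hexdigest()
```

Memory use stays bounded by chunk_size regardless of the file size, which is the property the issue asks for.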

Bower dependencies prevent installation

Package dependencies within the bower tool, specifically relating to angular 1.3.0, stop the installation (when using ansible).

workaround:

add this line to resolutions section of web/bower.json
"angular": "1.2.26"

New feature: ADMINISTRATION panel

New feature to add: ADMINISTRATION panel

The administration panel should be useful to customize IRMA directly from the WEB frontend.
The first example is for the TAGGING feature: having a tab to manage the tag(s):

Listing all available tag(s)
Add new tag(s) directly from the administration panel
Same with removing

Later, more features could be added to the administration panel (e.g. managing YARA rules).

File list bug if scanning twice the same file

Description:
A filename appears twice in the list after aborting a scan and submitting the same file for a second scan.

Occurrence:
/

When:
Always

Log Trace:
None

How to reproduce:

Submit a file for scanning
Cancel while uploading the file
Submit the same file and it will appear twice in the list

Deb packages won't install configuration files

The configuration files, i.e. /etc/init.d/celeryd and /etc/default/celeryd, are not deployed when installing 1.0.1 from the deb files on a clean system!

[testing] Search API returns an error when accessed without page or per_page query

SEARCH feature enhancement (dynamic)

Small enhancement of search feature

It contains (as already shown to you):

View title renamed to DATABASE (in the sense of "malware database", "sample database")
When reaching the search view, all the files must be displayed by default
The results should now be displayed in alphabetical order
When using the search field, the results table should be dynamically updated and filtered without submitting the form
Removing the search button since search is now dynamic
Add a reset button to the view
this button must reset all the search fields

I'll make the pull request on testing later

Surprisingly high CPU load

Setup (for the frontend machine):

8 CPUs at 2.3 GHz
8192 MB RAM
7 UWSGI workers

Asking for scan status (through the API) for ~80 scans per second leads to full CPU usage on the frontend (mainly due to the UWSGI processes). This seems a surprisingly high CPU load considering the setup. New scans then become increasingly slow.

Incoherent scan state

Sometimes, all probes are marked as finished, but the scan status is still set to "running" (in the result from the scan status API). The web interface shows "100% finished" but "scan state: running".

To reproduce: launch a large number of scans in a short time range (using the API), i.e. a "stress test".

Incomplete API /results

Looking at the API /scans/<scanid>/results, the description says you'll get a results property containing the total count of scan results items. Accordingly, in the Model description:

{
  "total": 0,
  "offset": 0,
  "limit": 0,
  "data": [ { ... } ]
}

Actually, only data is returned (without naming it).
In addition, the parameters limit and offset are completely ignored (unlike frontend/api/controllers/scans.py: list()).
So, on a big scan, there is no way to get back just the number of results and then iterate over them through a window [offset, offset + limit] (/scans/{scanId} and /scans/<scanid>/results return the full list of results).

Cosmetic one: the docstring of frontend/api/controllers/scans.py: get_results says The request should be performed using a POST request method, whereas it is actually a GET.
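For illustration, the envelope from the Model description (total, offset, limit, data) could be produced by a small helper like the one below. The name paginate is hypothetical, and the default limit of 25 is borrowed from the search controller quoted elsewhere on this page; how the real controller would build its query is not shown here.

```python
def paginate(items, offset=0, limit=25):
    """Return one window of results wrapped in the documented envelope."""
    if offset < 0 or limit < 0:
        raise ValueError("offset and limit must be non-negative")
    return {
        "total": len(items),
        "offset": offset,
        "limit": limit,
        "data": items[offset:offset + limit],
    }
```

With such an envelope, a client can call the route with limit=0 to learn the total, then walk the results window by window.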

Search interface lost state when pressing back button

Reproduce:

Make a search with some criteria, browse the details of one result, then try to go back to the search results: k-boum.

Idea:

Perhaps make the search request a GET request on the same page?
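The GET idea amounts to keeping the search criteria in the URL so that the back button restores them. A minimal sketch of that encoding, written in Python for consistency with the rest of this page rather than in the frontend's own code; search_url and criteria_from_query are hypothetical names.

```python
from urllib.parse import parse_qs, urlencode


def search_url(base, criteria):
    """Build a bookmarkable URL; empty criteria are left out of the query string."""
    params = {key: value for key, value in sorted(criteria.items()) if value}
    if not params:
        return base
    return base + "?" + urlencode(params)


def criteria_from_query(query):
    """Recover the search criteria from a query string."""
    return {key: values[0] for key, values in parse_qs(query).items()}
```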

Processing loops on an error

Description:
When the interface shows an error like "Error during launch", the page "_api/scan/add/535e444cf2d8ae21d8a167eb" is called in a loop

Occurring:
N/A

When:
Directly after an unsuccessful upload?

Log Trace:
None

Fast scan never finishes

While scanning a file with a really fast probe, the scan sometimes never finishes although the results are sent back.

Problem with API /infected route

An error is returned when accessing the /_api route. TO BE CONTINUED…

Comments feature

Adding a comment feature on scanned files (in details view).

Using this, all users will be able to add comments and share some information on the file.

SQLite operational errors

Need to retry SQL operations on operational errors (which occur when two writes to the SQLite db happen concurrently).
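A retry wrapper for this situation might look like the sketch below. It catches sqlite3.OperationalError from the standard library; code going through an ORM would need to catch that layer's own exception instead. The decorator name, attempt count, and linear backoff are all assumptions.

```python
import functools
import sqlite3
import time


def retry_on_operational_error(attempts=3, delay=0.1):
    """Retry the wrapped function when SQLite raises OperationalError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except sqlite3.OperationalError:
                    if attempt == attempts:
                        raise  # give up after the last attempt
                    time.sleep(delay * attempt)  # simple linear backoff
        return wrapper
    return decorator
```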

Show link report to a file description

It would be nice to have a direct link that can be copied from a file result description. Something like "Link to this page" with a tooltip "Click to copy this link"!

With love, :)

Predictable scan id

The MongoDB oid is predictable. One can get others' scan results. This issue has been known for a while.
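One common way to avoid guessable identifiers is to generate them from a random source. The sketch below uses a version 4 UUID from the Python standard library; new_scan_id is a hypothetical name, and this is not presented as the fix the project chose.

```python
import uuid


def new_scan_id():
    """Return a random identifier (UUID version 4) as a string."""
    return str(uuid.uuid4())
```

Random identifiers make enumeration impractical, but they do not replace access control on the results themselves.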

Allow irma-frontend to use SSL with RabbitMQ

Hanging scan

A few times, the scan just hangs. For some files in the scan, all probes are marked as done, while others are marked as pending. But the scan never progresses.

This has been raised by a stress test, such as #130.

Possible race condition with hashes

To reproduce: ask for multiple scans with the same hash at the same time (using the API).
The code ends on an error raised by load_from_sha256, because there are several entries in File (in DB) for the same hash...

Maybe related: there is no unique constraint on hash fields in the database, but the case is already considered through the MultipleResultsFound exception catch.
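The effect of a unique constraint on the hash column can be shown with a small standalone example. It uses sqlite3 and a hypothetical one-column file table, not the project's actual schema or ORM models; get_or_create_file is likewise an invented name.

```python
import sqlite3


def get_or_create_file(conn, sha256):
    """Return the row id for a hash, inserting it only if it is not already stored."""
    with conn:  # commits on success, rolls back on error
        # The UNIQUE constraint turns a duplicate insert into a no-op.
        conn.execute("INSERT OR IGNORE INTO file (sha256) VALUES (?)", (sha256,))
    row = conn.execute("SELECT id FROM file WHERE sha256 = ?", (sha256,)).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (id INTEGER PRIMARY KEY, sha256 TEXT NOT NULL UNIQUE)")
```

With the constraint in place, the database rather than the application decides which insert wins, so a lookup by hash can no longer find several rows.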

Scan a file from a URL

It would be nice to be able to provide a URL that points to the file you want scanned.

New feature: TAG(S) SETTING

New feature to add: a TAG SETTING in the SELECTION view

For example, sometimes we have a collection of files linked to a specific case and we would like to submit them all at once and tag them with a tag related to this case.

If no TAG is specified: add a "DEFAULT TAG" feature which automatically uses the default tag for each file (e.g. tag UNKNOWN by default if no tags have been specified)

Soft timeout reached while scanning a large number of files at once

[2016-02-25 10:30:32,799: WARNING/MainProcess] Soft time limit (60.0s) exceeded for frontend.tasks.scan_launch[01587d2e-fbc1-45fd-9042-372294085beb]
[2016-02-25 10:30:32,813: ERROR/MainProcess] Task frontend.tasks.scan_launch[01587d2e-fbc1-45fd-9042-372294085beb] raised unexpected: SoftTimeLimitExceeded()
Traceback (most recent call last):
  File "/opt/irma/irma-frontend/current/venv/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/opt/irma/irma-frontend/current/venv/local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in protected_call
    return self.run(*args, **kwargs)
  File "/opt/irma/irma-frontend/current/frontend/tasks.py", line 42, in scan_launch
    scan_ctrl.launch_asynchronous(scanid)
  File "/opt/irma/irma-frontend/current/frontend/controllers/scanctrl.py", line 338, in launch_asynchronous
    _add_empty_results(scan.files_web, scan_request, scan, session)
  File "/opt/irma/irma-frontend/current/frontend/controllers/scanctrl.py", line 127, in _add_empty_results
    _add_empty_result(fw, probelist, scan, session)
  File "/opt/irma/irma-frontend/current/frontend/controllers/scanctrl.py", line 106, in _add_empty_result
    session.commit()
  File "/opt/irma/irma-frontend/current/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 157, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/opt/irma/irma-frontend/current/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 801, in commit
    self.transaction.commit()
  File "/opt/irma/irma-frontend/current/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 402, in commit
    self._remove_snapshot()
  File "/opt/irma/irma-frontend/current/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 302, in _remove_snapshot
    s._expire(s.dict, self.session.identity_map._modified)
  File "/opt/irma/irma-frontend/current/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/state.py", line 422, in _expire
    self.expired_attributes.update(
  File "/opt/irma/irma-frontend/current/venv/local/lib/python2.7/site-packages/billiard/pool.py", line 235, in soft_timeout_sighandler
    raise SoftTimeLimitExceeded()
SoftTimeLimitExceeded: SoftTimeLimitExceeded()
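The traceback shows the time limit expiring inside a session.commit() reached through _add_empty_result, which suggests a commit for each result. Whether that is the actual bottleneck is an assumption. If it is, one possible mitigation is to insert all empty results in a single transaction, sketched here with sqlite3 and a hypothetical result table instead of the project's SQLAlchemy models.

```python
import sqlite3


def add_empty_results(conn, file_ids, probes):
    """Insert one empty result per (file, probe) pair and commit once at the end."""
    rows = [(file_id, probe) for file_id in file_ids for probe in probes]
    with conn:  # a single transaction, committed when the block exits
        conn.executemany(
            "INSERT INTO result (file_id, probe) VALUES (?, ?)", rows
        )
    return len(rows)
```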

Multi-tab feature in details view

This feature is a prerequisite for adding other features to the details view:

attachments
comments
...

There are many ways to do it; a simple and quick fix is suggested in PR:

Details view improvements
#96 opened on 27 Nov 2015 by deloittem

Enable HTTPS scheme in swagger configuration by default

Only the HTTP scheme is enabled in the swagger configuration. It seems that one can add multiple schemes.

We should enable HTTP and HTTPS by default. What do you think?

Search feature not always returning the last scan

Hello,

I think there is a problem related to the search feature.

When a file has been scanned multiple times, the search feature sometimes doesn't return the link to the last scan but to an older one. It is not always the case and happens randomly.

It seems related to the SQLAlchemy query:
.distinct(FileWeb.name)
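A DISTINCT on the name alone does not say which of several rows with the same name is kept, which would be consistent with the random behaviour described. The intended selection, the most recent scan for each name, is sketched below on plain dictionaries; the scan_date field and the function name are assumptions, not the project's model.

```python
def latest_per_name(rows):
    """Keep, for each file name, the row with the most recent scan date."""
    latest = {}
    for row in rows:
        current = latest.get(row["name"])
        if current is None or row["scan_date"] > current["scan_date"]:
            latest[row["name"]] = row
    return sorted(latest.values(), key=lambda row: row["name"])
```

In SQL the same result would typically come from an explicit ordering or grouping on the scan date rather than from DISTINCT alone.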

List of all scanned files

Hi,

It would be nice to have a page where all scanned files are displayed. Maybe the "Search" page could display all files by default?

Also, it would be nice if the files were displayed ordered by last scan timestamp, I think it would make more sense for the user.

Thank you if you have some time to think about it!

PS: actually, this patch returns all the files if no "name" or "hash" is specified in the /search/files request (or if they are empty):

diff --git a/frontend/api/controllers/search.py b/frontend/api/controllers/search.py
index f755c40..f6cf018 100644
--- a/frontend/api/controllers/search.py
+++ b/frontend/api/controllers/search.py
@@ -46,9 +46,9 @@ def files(db):
         offset = int(request.query.offset) if request.query.offset else 0
         limit = int(request.query.limit) if request.query.limit else 25

-        if name is not None:
+        if name:
             base_query = FileWeb.query_find_by_name(name, db)
-        elif h_value is not None:
+        elif h_value:
             h_type = guess_hash_type(h_value)

             if h_type is None:
@@ -56,7 +56,7 @@ def files(db):

             base_query = FileWeb.query_find_by_hash(h_type, h_value, db)
         else:
-            raise ValueError("Missing query attribute name or hash")
+            base_query = FileWeb.query_find_by_name("%", db)

         # TODO: Find a way to move pagination as a BaseQuery like in
         #       flask_sqlalchemy.

Not the best way to do it, but this kind of API may be useful.

Bower install issues

I see the following after the bower installation:

bower angular-strap#~2.0.2 validate 2.0.5 against https://github.com/mgcrea/angular-strap.git#~2.0.2
bower ENOTFOUND Package option not found

This suggests that something is no longer available, or similar.

After this I cannot continue the installation.

Indicate antivirus signature DB version or date in scan results

Hello

This is a feature request.
Antivirus scan results already indicate the antivirus version. It would be nice to also indicate the signature database version or date.

Thanks for considering this issue.
Best regards

[Feature-request] IRMA python + on_complete/blocking wait

Hi!

I would like to use the IRMA frontend API from a Python script.
So I have two points (if you consider they are not related, I can open a second issue).

Firstly, regarding a potential irma-python module, should I use irma-frontend/frontend/cli/apiclient.py?
It would be great to have an external standalone module with just a few dependencies (like python-requests).
That way, one could launch a scan with (fictitious sample API):

from irma import IrmaServer, IrmaScan
isrv = IrmaServer.from_url("http://irma.example.com")
scan = IrmaScan(isrv)
scan.add_files("bad.exe")
scan.launch()
...
print "Probes", isrv.probes
...
for scan in isrv.scans:
    print scan.results

Secondly, given this kind of API, it would be great to have both blocking and non-blocking calls to wait for a scan to end.
That is to say, a first API (for instance, scan.wait()) which gives control back to the caller when the scan is complete, and a second one (for instance, scan.on_complete(callback)) which is asynchronous, can use the first one in a Thread, and calls the given callback when the scan is complete.

Please ask if you need to have more details/I'm not clear enough.
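The two calls requested above, a blocking wait and a callback run from a thread, could be sketched like this. ScanWaiter is a hypothetical helper: it polls a caller-supplied is_finished callable instead of talking to the real IRMA API, and the polling interval is an assumption.

```python
import threading
import time


class ScanWaiter:
    """Poll a status callable until it reports completion (hypothetical helper)."""

    def __init__(self, is_finished, interval=1.0):
        self._is_finished = is_finished  # callable returning True when the scan is done
        self._interval = interval

    def wait(self, timeout=None):
        """Block until the scan is finished; return False if the timeout expires first."""
        deadline = None if timeout is None else time.monotonic() + timeout
        while not self._is_finished():
            if deadline is not None and time.monotonic() >= deadline:
                return False
            time.sleep(self._interval)
        return True

    def on_complete(self, callback):
        """Run wait() in a background thread, then call callback(); return the thread."""
        def runner():
            if self.wait():
                callback()
        thread = threading.Thread(target=runner, daemon=True)
        thread.start()
        return thread
```

In a real client, is_finished would wrap a request to the scan status route; on_complete returns the thread so the caller can join it if needed.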

As the IRMA system, when the file is not available on disk and the user submits it again, I should download it

See the problem and the recommendation from @fmonjalet there: #105.
