Comments (14)
Update: it was the ProxyHTMLEnable directive. From the documentation it appears to be required, but commenting it out has resolved the issue.
You couldn't load it because it's a module that probably wasn't enabled on your test system.
It's also got a known bug that matches our issue: https://bz.apache.org/bugzilla/show_bug.cgi?id=64339
I'm going to do some more Apache research; you can close this issue. Expect a pull request with some updated proxy documentation (which is the only change I think is needed here).
from sharry.
Other notes - this is the docker image, using PostgreSQL for the backend.
from sharry.
Oh my…. Thank you for reporting! I'll take a look. I'm really curious about the reason for this, especially since it seems related to the content type.
from sharry.
I haven't tested this hypothesis out, but could it be four-letter extensions? (I'll upload a jpeg to see if it gets twisted.) Tested it: an uploaded JPEG with the extension jpeg isn't corrupted.
I can provide shares and aliases for testing if that helps.
from sharry.
Renaming a word document to .zip, uploading it and downloading it leaves things uncorrupted.
from sharry.
Link to two identical files, uploaded with different extensions, one downloads clean, the other corrupt.
https://sharry.kent-school.edu/app/open/6jDQxbREegR-d76TEmfuNeN-zx15TBwVC1e-xyaxffbbxCe
from sharry.
One more important fact - this instance is behind an apache reverse proxy - Below is the virtualhost entry. I haven't replicated the issue in a different environment.
<VirtualHost sharry.kent-school.edu:443>
# The ServerName directive sets the request scheme, hostname and port that
# the server uses to identify itself. This is used when creating
# redirection URLs. In the context of virtual hosts, the ServerName
# specifies what hostname must appear in the request's Host: header to
# match this virtual host. For the default virtual host (this file) this
# value is not decisive as it is used as a last resort host regardless.
# However, you must set it for any further virtual host explicitly.
#ServerName www.example.com
ServerAdmin webmaster@localhost
# Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
# error, crit, alert, emerg.
# It is also possible to configure the loglevel for particular
# modules, e.g.
#LogLevel info ssl:warn
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
ProxyBadHeader Ignore
SetEnv no-gzip On
RewriteEngine On
RewriteRule ^/app/assets/sharry-webapp/1.2.0i/favicon/.*.png https://www.kent-school.edu/favicon.ico [R]
SetEnv proxy-sendchunks On
ProxyPass "/" "http://localhost:4090/" timeout=2400 keepalive=on
ProxyPassReverse "/" "http://localhost:4090/"
ProxyHTMLEnable On
# For most configuration files from conf-available/, which are
# enabled or disabled at a global level, it is possible to
# include a line for only one particular virtual host. For example the
# following line enables the CGI configuration for this host only
# after it has been globally disabled with "a2disconf".
#Include conf-available/serve-cgi-bin.conf
ServerName sharry.kent-school.edu
SSLCertificateFile /etc/letsencrypt/live/sharry.kent-school.edu/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/sharry.kent-school.edu/privkey.pem
Include /etc/letsencrypt/options-ssl-apache.conf
</VirtualHost>
from sharry.
Great - Thank you for the test files!
from sharry.
I'm not comfortable enough with psql to get you the records from the chunkdata table for an example (I can do a query, but I don't know how to get the result into a file for you).
from sharry.
Don't worry! I should be able to reproduce it with your files and then I'll see where it leads me. Thank you for all the info. If I get stuck I'm coming back to you with questions :-)
from sharry.
Hello, I just tried it quickly with my local setup (no reverse proxy) and couldn't reproduce it. I downloaded your files and it showed the problem, one is 350K the other 234K, really strange!
Could you maybe try the following to check whether the database contains correct content? (it seems so by looking at the lengths): Edit the share description and put in this:
{{#files}}
- {{name}}: `{{checksum}}`
{{/files}}
When saving the share, it should show all attachment filenames and their sha256 checksums as stored in the db (you could also query the filemeta database table). They should be the same in the example from above (for me it is a2af46e7…c2e2c64). If they are the same, could you then try to download without the Apache proxy in front? If they are different, there is an upload problem. I'm going to set up an Apache here and see if I can reproduce it.
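The same check can be run against a downloaded copy. This is only a rough sketch: it assumes the checksum sharry shows is a plain hex SHA-256 of the file bytes, and the function names are made up for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=64 * 1024):
    """Return the hex SHA-256 digest of the file at `path`, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size blocks so large downloads do not need to fit in memory.
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """True if the file hashes to `expected_hex` (surrounding quotes and case ignored)."""
    expected = expected_hex.strip().strip('"').lower()
    return sha256_of_file(path) == expected
```

Calling matches_checksum on the downloaded docx with the full checksum from the share description should return False for a corrupted download and True for a clean one, if the assumption about the checksum format holds.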
from sharry.
I just remembered that the checksum is also sent with an ETag header in the response. They are both equal, so I now assume that the db contains the correct data and that the problem is rather related to downloading. When I run the downloads via curl -v I see this:
"bad" file:
curl -v --output /dev/null https://sharry.kent-school.edu/api/v2/open/share/6jDQxbREegR-d76TEmfuNeN-zx15TBwVC1e-xyaxffbbxCe/file/BuJtZYi82Gg-1mzogfXtLG8-wxwED3XiuWG-LzHXMdt9UbE
< HTTP/1.1 200 OK
< Date: Thu, 07 May 2020 19:24:51 GMT
< Server: Apache/2.4.29 (Ubuntu)
< Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document;charset=utf-8
< Accept-Ranges: bytes
< Last-Modified: Thu, 07 May 2020 13:30:29 GMT
< Content-Disposition: inline; filename="Document.docx"
< ETag: "a2af46e7745897f3a830e9b0b2de90d395d6a759e7049c663c7f155bfc2e2c64"
< Transfer-Encoding: chunked
"good" file:
curl -v --output /dev/null https://sharry.kent-school.edu/api/v2/open/share/6jDQxbREegR-d76TEmfuNeN-zx15TBwVC1e-xyaxffbbxCe/file/E7PkezZjVLP-cMvKnDTUW52-LEzymkU1ho6-imJpkZoRs4c
< HTTP/1.1 200 OK
< Date: Thu, 07 May 2020 19:32:20 GMT
< Server: Apache/2.4.29 (Ubuntu)
< Content-Type: application/zip
< Accept-Ranges: bytes
< Last-Modified: Thu, 07 May 2020 13:28:21 GMT
< Content-Disposition: inline; filename="document.zip"
< ETag: "a2af46e7745897f3a830e9b0b2de90d395d6a759e7049c663c7f155bfc2e2c64"
< Content-Length: 240461
For the first file, the ;charset=utf-8 looks suspicious, but I don't think it is related. Then it seems that Apache sends chunked responses. Sharry may also send chunks, so maybe Apache messes it up somehow? It is really strange that it only applies to the docx file and not the zip version, since both are identical…. Unfortunately, I'm not at all familiar with Apache configuration. Another thing I could imagine is that Apache won't compress already compressed files (like zip or jpg) but tries to do this to other files it thinks are not compressed, which then results in chunked transfers (which should work actually, but may cause this…). If you could verify whether the error also occurs without Apache, that would be helpful I think.
For comparison, I uploaded the same file here: https://box.daheim.site/app/open/4qPTg3UhYkT-z4rw2X9qi5g-BTgpe1jb3HB-x2Jb5oT6RLo
I tried with an Apache reverse proxy here, too. I used Apache 2.4.43. I used your config where possible (there is a huge default Apache config before this virtual host and I tested without TLS). I had to remove the line ProxyHTMLEnable On, because my Apache wouldn't start otherwise. With this setup I could not reproduce it either.
< HTTP/1.1 200 OK
< Date: Thu, 07 May 2020 20:49:00 GMT
< Server: Apache/2.4.43 (Unix)
< Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
< Accept-Ranges: bytes
< Last-Modified: Thu, 07 May 2020 20:46:18 GMT
< Content-Disposition: inline; filename="document_orig.docx"
< ETag: "a2af46e7745897f3a830e9b0b2de90d395d6a759e7049c663c7f155bfc2e2c64"
< Content-Length: 240461
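The difference between the header dumps above can also be picked out mechanically. Below is a minimal sketch, not anything sharry or curl ships: the helper names are hypothetical, and it only assumes that response headers in curl -v output start with "< " as in the dumps above.

```python
def parse_curl_headers(verbose_output):
    """Collect response headers from `curl -v` output (lines starting with '< ')."""
    headers = {}
    for line in verbose_output.splitlines():
        if not line.startswith("< ") or ":" not in line:
            continue  # skips the status line and anything that is not a header
        name, _, value = line[2:].partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def body_framing(headers):
    """Describe how the response body length is signalled."""
    if "chunked" in headers.get("transfer-encoding", "").lower():
        return "chunked"
    if "content-length" in headers:
        return "content-length=" + headers["content-length"]
    return "unknown"


# Header lines copied from the "bad" and "good" responses in this thread.
BAD = """< HTTP/1.1 200 OK
< Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document;charset=utf-8
< Transfer-Encoding: chunked"""

GOOD = """< HTTP/1.1 200 OK
< Content-Type: application/zip
< Content-Length: 240461"""

print(body_framing(parse_curl_headers(BAD)))   # chunked
print(body_framing(parse_curl_headers(GOOD)))  # content-length=240461
```

A chunked response is not wrong by itself; the sketch only makes it easy to see which downloads lose their Content-Length on the way through the proxy.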
from sharry.
When not going through the reverse proxy, it downloads clean, no chunking.
If I tell Apache not to chunk its transfers, it still gets chunked (which leaves me suspecting that it's being chunked by sharry in that configuration).
If your Apache doesn't have the ProxyHTMLEnable directive set, then URLs in the returned webpage don't get rewritten (I expect that in your test environment it's still making API calls to the unproxied server).
At least we know that the x-factor is definitely the apache+ssl+reverseproxy combination. Additionally, I'm taking comfort that it's only showing up with downloads at the moment.
from sharry.
That's good to hear! I think (with my understanding of this directive) that it is not necessary for sharry if you deploy sharry at the root path. The only HTML that is returned from the server is just one page that loads the javascript application. The links in there are without the hostname, e.g. /app/assets/sharry-webapp/1.3.0/sharry-app.js. The rest is all covered by setting the base-url to the "outside" url.
A PR with updated docs would be great, of course. Thank you.
from sharry.