Comments (6)
Hi @Frooodle, I think this is a good issue to assign to someone like me who wishes to contribute to open source 😃
My initial idea for solving this issue is to update collectFilesForProcessing to ensure we only collect files that are fully copied, either by checking whether the size is still growing or by using some OS-level feature (e.g. lsof).
Let me know if there are any comments on the solution 😆
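For the OS-level option, one thing that can be tried from plain Java (without shelling out to lsof) is to attempt an exclusive file lock. This is only a rough sketch: the class FileLockProbe and the method canLockExclusively are made-up names, not part of Stirling-PDF, and on Linux such locks are usually advisory, so a copy in progress is only detected if the writer also takes a lock.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Stirling-PDF.
class FileLockProbe {

    // Returns true if an exclusive lock on the file can be acquired right now.
    // Returns false if the file is missing, unreadable, or already locked.
    static boolean canLockExclusively(Path path) {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE);
             FileLock lock = channel.tryLock()) {
            // tryLock() returns null when another process holds an overlapping lock.
            return lock != null;
        } catch (OverlappingFileLockException | IOException e) {
            // OverlappingFileLockException: this JVM already holds a lock on the file.
            // IOException: the file could not be opened or locked.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("probe", ".pdf");
        try {
            System.out.println("lockable: " + canLockExclusively(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because the lock semantics differ between operating systems, this would probably work best combined with the size check rather than as a replacement for it.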
from stirling-pdf.
A good callout and bug
I will work on this over the weekend
from stirling-pdf.
@kkdlau how's this going?
from stirling-pdf.
Here is an example. I haven't tested it, and I'm not a Java expert, but it could lead in the right direction.
PipelineDirectoryProcessor.java
// [...]
import java.util.concurrent.TimeUnit;
// [...]
public class PipelineDirectoryProcessor {
    // [...]
    private static final long STABILITY_CHECK_DELAY = 1000; // 1 second between size checks
    private static final int STABILITY_CHECK_COUNT = 5; // Check 5 times

    private File[] collectFilesForProcessing(Path dir, Path jsonFile, PipelineOperation operation) throws IOException {
        try (Stream<Path> paths = Files.list(dir)) {
            if ("automated".equals(operation.getParameters().get("fileInput"))) {
                // Skip directories, the pipeline JSON itself, and files that are still being written
                return paths.filter(path -> !Files.isDirectory(path) && !path.equals(jsonFile) && isFileStable(path))
                        .map(Path::toFile)
                        .toArray(File[]::new);
            } else {
                String fileInput = (String) operation.getParameters().get("fileInput");
                return new File[] { new File(fileInput) };
            }
        }
    }

    // Must not declare a checked exception: it is called from the lambda passed to
    // Stream.filter, and a Predicate cannot throw IOException.
    private boolean isFileStable(Path path) {
        try {
            long initialSize = Files.size(path);
            for (int i = 0; i < STABILITY_CHECK_COUNT; i++) {
                TimeUnit.MILLISECONDS.sleep(STABILITY_CHECK_DELAY);
                if (Files.size(path) != initialSize) {
                    return false; // Size changed, so the file is still being written
                }
            }
            return initialSize > 0; // Also ensuring the file is not zero bytes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Preserve the interrupt status
            return false;
        } catch (IOException e) {
            return false; // File vanished or is unreadable; skip it on this pass
        }
    }
    // [...]
}
// [...]
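To try the size check outside the project, a standalone variant could look like this. It is only a sketch: the class name StabilityCheckDemo and the checks/delayMillis parameters are assumptions added so the check can be run quickly, and they do not exist in Stirling-PDF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

// Hypothetical standalone variant for experimentation; not Stirling-PDF code.
class StabilityCheckDemo {

    // Returns true if the file is non-empty and its size stays the same
    // across `checks` polls spaced `delayMillis` apart.
    static boolean isFileStable(Path path, int checks, long delayMillis) {
        try {
            long initialSize = Files.size(path);
            for (int i = 0; i < checks; i++) {
                TimeUnit.MILLISECONDS.sleep(delayMillis);
                if (Files.size(path) != initialSize) {
                    return false; // Size changed between polls
                }
            }
            return initialSize > 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Preserve the interrupt status
            return false;
        } catch (IOException e) {
            return false; // Missing or unreadable file is treated as not ready
        }
    }

    public static void main(String[] args) throws IOException {
        Path complete = Files.createTempFile("complete", ".pdf");
        Path empty = Files.createTempFile("empty", ".pdf");
        try {
            Files.write(complete, new byte[] {1, 2, 3});
            System.out.println(isFileStable(complete, 3, 10)); // true
            System.out.println(isFileStable(empty, 3, 10));    // false
        } finally {
            Files.deleteIfExists(complete);
            Files.deleteIfExists(empty);
        }
    }
}
```

One thing to keep in mind: with 5 checks at 1 second each, every file blocks for about 5 seconds, and since the filter runs sequentially a directory with many files would take roughly 5 seconds per file to scan.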
from stirling-pdf.
@kkdlau how's this going?
Hi, I was busy with my full-time work 😞
But I already have a draft of the PR.
I just need to run a couple of regression tests to ensure it doesn't break the existing features.
Will create a PR tonight (APAC time) 👍🏻
from stirling-pdf.
Here is an example. I haven't tested it, and I'm not a Java expert, but it could lead in the right direction.
Thanks for the idea 👍🏻
My draft is quite similar except for the isFileStable implementation.
I will share more details when I open the PR.
from stirling-pdf.