Comments (9)

shemanaev commented on May 26, 2024

The server itself is designed to work with index files called inpx and doesn't provide a way to scan the filesystem itself. I haven't tested it in scenarios other than fb2-ready inpx files, so there might be (and will be, I'm sure 😄) bugs.
But basically you need to:

  • produce an .inpx file in some way for every root directory (e.g. if you have c:\lib1 and c:\lib2 you'll need two files)
  • if you want info that does not fit into the inpx format (cover, annotation), you'll have to implement IBookParser and register it with BookParsersPool
  • import every .inpx with its related root (e.g. dotopds import c:\lib1 lib1.inpx)

I found the .inpx format description only in Russian, so here is a translation

from dotopds.

gerritv commented on May 26, 2024

Thank you, that helps me a lot. I have been reading the code and understand more than when I opened the Issue :-)
I can generate the .inpx from my PDF parser, will test that out and then decide what to do next.
I am impressed with the design, it looks very expandable.

gerritv commented on May 26, 2024

I have the pdf scanner added (Utils/PdfParser.cs), I chose to recursively scan the directory and process each pdf rather than creating an intermediate file. I didn't add another parser to Parsers, the generic one there is sufficient as the Class in Utils does all the work, using InpxParser.cs as a template.
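
A recursive scan like the one described above could be sketched as follows. This is only an illustration, not the code in Utils/PdfParser.cs: the class name PdfScanSketch and the method FindPdfs are hypothetical, and only the standard System.IO calls are real.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of DotOPDS.
public static class PdfScanSketch
{
    // Lazily yields every file under 'root' (subdirectories included)
    // whose extension is ".pdf", compared case-insensitively.
    public static IEnumerable<FileInfo> FindPdfs(string root)
    {
        foreach (var path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            if (string.Equals(Path.GetExtension(path), ".pdf", StringComparison.OrdinalIgnoreCase))
            {
                yield return new FileInfo(path);
            }
        }
    }
}
```

Each FileInfo can then be handed to whatever extracts the PDF metadata. Note that Directory.EnumerateFiles throws if it reaches a directory it is not allowed to read, so a real scanner would probably want to handle that case.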

Pondering how to add it to the commands. Would it be better to create another Class in Tasks called PdfScanTask and then a 'pdfscan' command to run it? Much or most of the code in PdfScanCommand.cs would be the same as ImportCommand.cs. I had thought of generalizing ImportTask to make it take an option indicating what to import but that got more complex.

gerritv commented on May 26, 2024

Ok, upon further pondering over an espresso I modified ImportTask and ImportCommand:

  • Added required option ImportType=inpx or pdf,
  • added code in ImportTask to run one of those 2 tasks. Long term it might be best to add a base class for Parser in Parsers and move inpx/pdf parsers to that directory?
    Now on to testing & debugging
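
The ImportType switch described above might look roughly like the sketch below. It is a guess at the shape, not the actual change in ImportTask or ImportCommand: the enum, the class ImportDispatchSketch and both methods are hypothetical names, and the two importers are passed in as plain callbacks so the example stays self-contained.

```csharp
using System;

// Hypothetical option values; the thread only says "ImportType=inpx or pdf".
public enum ImportType { Inpx, Pdf }

public static class ImportDispatchSketch
{
    // Parses the option value case-insensitively and rejects anything
    // that is not one of the defined import types.
    public static ImportType ParseImportType(string value)
    {
        ImportType result;
        if (Enum.TryParse(value, true, out result) && Enum.IsDefined(typeof(ImportType), result))
        {
            return result;
        }
        throw new ArgumentException("ImportType must be 'inpx' or 'pdf', got: " + value);
    }

    // Runs exactly one of the two importers, chosen by the parsed option.
    public static void Run(ImportType type, Action importInpx, Action importPdf)
    {
        switch (type)
        {
            case ImportType.Inpx:
                importInpx();
                break;
            case ImportType.Pdf:
                importPdf();
                break;
            default:
                throw new ArgumentOutOfRangeException("type");
        }
    }
}
```

A shared base class for the parsers, as suggested in the second bullet, would replace the switch with a lookup of the parser registered for the given type.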

gerritv commented on May 26, 2024

You can see my code changes so far in https://github.com/gerritv/DotOPDS. Scanning of PDFs is working, but I can't get queries working via Aldiko. I tried forcing all books/PDFs to have Genre other,other but still no joy.
So, my next question is: where can I learn about using Owin and System.Web.Http to create some different web pages for serving pages?

shemanaev commented on May 26, 2024

Hey Gerrit,
The genre should be its id, not a human-readable string. You should pick one from the list.Add("sf_history");-style entries in Genres.cs.
And your Book model will look like this:

var args = new Book
{
    Authors = new[] { author },
    Genres = new[] { "other" },
    Title = info.Title,
    File = Path.GetFileNameWithoutExtension(fi.FullName),
    Size = (int)fi.Length,
    Ext = "pdf",
    Date = info.CreationDate,
    Language = "en",
    Keywords = info.Keywords.Split(','),
    Archive = "",
};

I've also pushed some fixes to master; you should pull them.
And there is one problem I can't figure out yet: LuceneImporter always uses RussianAnalyzer for now, as there is neither language autodetection nor a good way to populate it on import.

gerritv commented on May 26, 2024

Thank you for those fixes/changes.
I now have things sort of working using FBReader. Aldiko and OPDSViewer don't like whatever is being returned.
I also need to work on the File pathname, as my files can be in subdirectories of the Library Path. Your solution above strips out the intermediate directories. My initial method was also wrong, as it resulted in the Library Path appearing twice in the download link.
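
One way to keep the intermediate directories without repeating the Library Path is to store the path relative to the library root. The sketch below assumes that File should hold such a relative path without extension (the thread does not confirm how DotOPDS builds the download link); LibraryPathSketch and RelativeToLibrary are hypothetical names, and the comparison is case-insensitive on the assumption of a Windows filesystem.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of DotOPDS.
public static class LibraryPathSketch
{
    // Returns the path of 'filePath' relative to 'libraryRoot', without the
    // file extension, e.g. root c:\lib1 and file c:\lib1\sub\book.pdf
    // give sub\book on Windows.
    public static string RelativeToLibrary(string libraryRoot, string filePath)
    {
        // Normalize both paths and make sure the root ends with one separator,
        // so that c:\lib1 does not accidentally match c:\lib10.
        var root = Path.GetFullPath(libraryRoot)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
            + Path.DirectorySeparatorChar;
        var full = Path.GetFullPath(filePath);

        // Assumption: case-insensitive match, as on a typical Windows volume.
        if (!full.StartsWith(root, StringComparison.OrdinalIgnoreCase))
        {
            throw new ArgumentException("File is not inside the library root: " + filePath);
        }

        var relative = full.Substring(root.Length);
        // Keep the subdirectories, drop only the extension.
        var directory = Path.GetDirectoryName(relative) ?? "";
        return Path.Combine(directory, Path.GetFileNameWithoutExtension(relative));
    }
}
```

In the Book initializer this would stand in for Path.GetFileNameWithoutExtension(fi.FullName), with the library root passed alongside fi.FullName.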

I will close this Issue as I am now well past the original question.
I would, though, appreciate a link, a book, or something else where I can learn about WebApi2/Owin/Nowin in English (or Dutch).

shemanaev commented on May 26, 2024

I learned WebApi 2 from official docs.
Nowin/OWIN is pretty straightforward if you go through the Nowin samples and the OWIN spec.

> Your solution above strips out the intermediate directories.

Yeah, I don't remember all the .NET APIs, but you get the point 😉

gerritv commented on May 26, 2024

Thx, the Message Lifecycle diagram is a huge help.

Yes, I got it :-) My setup is a bit unusual.
Now trying to figure out how to make some pull requests without feeding you my PDF solution. (It relies on DebenuPDFLite, which is a bit of a pain to install but is free.) Looking at git cherry-pick.
