
Comments (17)

chriswendt1 commented on May 28, 2024

You can use the "doctr formats" command to retrieve the list of file formats.
The output currently is:

>doctr formats
CSV     .csv
ExcelSpreadsheet        .xls
HtmlFile        .html   .htm
Markdown        .markdown       .mdown  .mkdn   .md     .mkd    .mdwn   .mdtxt  .mdtext .rmd
Mhtml   .mhtml  .mht
OpenDocumentPresentation        .odp
OpenDocumentSpreadsheet .ods
OpenDocumentText        .odt
OpenXmlPresentation     .pptx
OpenXmlSpreadsheet      .xlsx
OpenXmlWord     .docx
OutlookMailMessage      .msg
PlainText       .txt
PortableDocumentFormat  .pdf
PowerpointPresentation  .ppt
RichTextFormat  .rtf
TSV     .tsv    .tab
WordDocument    .doc
XLIFF   .xlf

You do not need to set the file format for each file; it is determined by the file extension.
The list of formats is determined entirely by the service.
An upcoming enhancement is a mechanism to add local pre- and post-processing for additional file formats. The top candidate is the VTT/SRT subtitle format.
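
To illustrate what "determined by the extension" means in practice, here is a minimal Python sketch that builds a lookup table from the `doctr formats` listing quoted above. The function names `build_extension_map` and `format_for` are hypothetical and this is not the tool's actual code; it only shows the idea under the assumption that matching is a case-insensitive lookup on the file suffix.

```python
from pathlib import Path
from typing import Optional

# Listing copied from the `doctr formats` output above:
# format name first, then one or more extensions.
FORMATS_OUTPUT = """\
CSV .csv
ExcelSpreadsheet .xls
HtmlFile .html .htm
Markdown .markdown .mdown .mkdn .md .mkd .mdwn .mdtxt .mdtext .rmd
Mhtml .mhtml .mht
OpenDocumentPresentation .odp
OpenDocumentSpreadsheet .ods
OpenDocumentText .odt
OpenXmlPresentation .pptx
OpenXmlSpreadsheet .xlsx
OpenXmlWord .docx
OutlookMailMessage .msg
PlainText .txt
PortableDocumentFormat .pdf
PowerpointPresentation .ppt
RichTextFormat .rtf
TSV .tsv .tab
WordDocument .doc
XLIFF .xlf
"""


def build_extension_map(listing: str) -> dict[str, str]:
    """Map each extension (lower case) to its format name."""
    mapping: dict[str, str] = {}
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, *extensions = parts
        for ext in extensions:
            mapping[ext.lower()] = name
    return mapping


def format_for(filename: str, mapping: dict[str, str]) -> Optional[str]:
    """Return the format name for a file, or None if the extension is unknown."""
    return mapping.get(Path(filename).suffix.lower())


EXTENSION_MAP = build_extension_map(FORMATS_OUTPUT)
print(format_for("report.DOCX", EXTENSION_MAP))
print(format_for("captions.vtt", EXTENSION_MAP))
```

With the listing above, a `.docx` file resolves to `OpenXmlWord`, while a `.vtt` file resolves to nothing, which is why subtitle formats would need the local pre- and post-processing mentioned above.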

from documenttranslation.

chriswendt1 commented on May 28, 2024

Milestone work is to add this to the README.

parclarke commented on May 28, 2024

Wow! That is a lot of formats. I was not expecting that. I read there is a 5000-character limit on the amount of text you can translate. Does that limitation apply to this project? Here is the error I am getting:

>doctr formats
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
at TranslationService.CLI.Program.<>c.<b__1_19>d.MoveNext() in C:\work\Source\Repos\DocumentTranslation\DocumentTranslation.CLI\Program.cs:line 253

chriswendt1 commented on May 28, 2024

This is using Microsoft's document translation service, which has different request limits. The limits are listed here:
https://docs.microsoft.com/en-us/azure/cognitive-services/translator/document-translation/get-started-with-document-translation?tabs=csharp#content-limits

Thank you, I will fix this crash.
This command requires the complete set of credentials and doesn't behave well otherwise :-(

chriswendt1 commented on May 28, 2024

Updated the documentation with more detailed instructions on how to obtain credentials, and how to find the supported document formats.

parclarke commented on May 28, 2024

I am still getting a NullReferenceException when I execute doctr formats, again at Program.cs:line 253. I suspect something is missing in my appsettings file.

chriswendt1 commented on May 28, 2024

Did you rebuild or use a provided binary?
I just uploaded doctr.zip, compiled from current master branch, to
https://github.com/MicrosoftTranslator/DocumentTranslation/releases/edit/0.2.0.0
Previous binary was outdated.

parclarke commented on May 28, 2024

I did a Git Pull and recompiled the solution. I will try the zip file.

chriswendt1 commented on May 28, 2024

That should have worked. Are you on the current code?
There is nothing particularly interesting happening around line 253, and nothing to do with formats.

parclarke commented on May 28, 2024

Using the new zip file under releases I'm getting the following error: List of translatable extensions cannot be null. (Parameter 'Extensions')

Using the GUI, this is the error I'm getting:

[screenshot: Capture]

chriswendt1 commented on May 28, 2024

This means you didn't get the enumeration of the file formats from the service.
When you "Test" your credentials, do you get a pass?
doctr config test
or
the "Test" button on the GUI's Authentication page.

parclarke commented on May 28, 2024

Yes I get a PASS

chriswendt1 commented on May 28, 2024

That is good. Wild theories:

  • Your key is not for the Translator resource, but a different one or an All-In-One key.
  • The key is for a different Translator resource than the one you gave the name for.

The storage account shouldn't matter; you don't need a storage account to enumerate the formats.

parclarke commented on May 28, 2024

My apologies. I was using a Cognitive Services key and not a Translator service key. It is working fine now. The funny thing is that when I used the Cognitive Services key, I could still translate text using the GUI app.

chriswendt1 commented on May 28, 2024

OK, thanks. No surprise that the text translation worked: it is less sensitive to the key type.
It is a bug that the "Test" let you pass.
I will repro with your scenario and then try to catch the empty response.
Thanks for the report, and let me know what else you find.
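
For illustration only, catching the empty response could look roughly like the Python sketch below. It is not the project's code (the project is C#), and everything named here is an assumption: the `require_extensions` helper, the `FormatListUnavailableError` class, and the response shape with a `value` list whose entries carry `fileExtensions`. The point is to fail with a readable message instead of passing a null extension list further down.

```python
from typing import Optional


class FormatListUnavailableError(RuntimeError):
    """Raised when the service did not return any file formats (hypothetical)."""


def require_extensions(response: Optional[dict]) -> list[str]:
    """Collect extensions from a parsed formats response, or fail clearly.

    `response` is assumed to look like
    {"value": [{"format": "PlainText", "fileExtensions": [".txt"]}, ...]}
    and may be None or empty when the key type is wrong.
    """
    formats = (response or {}).get("value") or []
    extensions = [
        ext
        for fmt in formats
        for ext in (fmt.get("fileExtensions") or [])
    ]
    if not extensions:
        raise FormatListUnavailableError(
            "The service returned no file formats. Check that the key belongs "
            "to the Translator resource named in the settings, not to a "
            "Cognitive Services resource."
        )
    return extensions


if __name__ == "__main__":
    ok = {"value": [{"format": "PlainText", "fileExtensions": [".txt"]}]}
    print(require_extensions(ok))
    try:
        require_extensions(None)
    except FormatListUnavailableError as err:
        print(f"Caught: {err}")
```

A check like this would turn both symptoms reported in this thread (the NullReferenceException and the "List of translatable extensions cannot be null" message) into one error that points at the key.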

chriswendt1 commented on May 28, 2024

I am getting a "Test failed" result with the Cognitive Services key.

chriswendt1 commented on May 28, 2024

Thanks again for the help. I updated the code to catch the failure with a Cognitive Services key. The binaries have not been updated.
Let me know what else you find.
