jhudsl / ari

:dancers: The Automated R Instructor

Home Page: https://jhudatascience.org/ari/

License: Other

R 11.06% HTML 88.94%

edtech-software

ari's Issues

path errors on win + RStudio

The slides path is built with a missing slash after localhost. Moreover, when webshot:::is_windows() is TRUE,
webshot calls webshot:::fix_windows_url(), which prepends its own version of "file:///"...; so if ari_narrate() attaches "file://localhost/" before that call, it creates a redundant prefix, which leads to an error.

ari::ari_narrate(
  script = system.file("test", "ari_intro_script.md", package = "ari"),
  slides = system.file("test", "ari_intro.html", package = "ari"),
  voice  = "Joey"
)
#> "C:/Program Files/RStudio/bin/pandoc/pandoc" +RTS -K512m -RTS ari_intro_script.utf8.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output pandoc495415dc44b5.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template "C:\Users\corra\Documents\R\win-library\3.4\rmarkdown\rmd\h\default.html" --no-highlight --variable highlightjs=1 --variable "theme:bootstrap" --include-in-header "C:\Users\corra\AppData\Local\Temp\RtmpaU6xA7\rmarkdown-str495471b03a64.html" --mathjax --variable "mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
#> 
#> Output created: C:\Users\corra\AppData\Local\Temp\RtmpOCgHzv/ari_script_FYRyVqGWrzBH.html
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="file://localhostC:\Users\corra\Documents\R\win-library\3.4\ari
#> \test\ari_intro.html#1": The filename, directory name, or volume label
#> syntax is incorrect
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="file://localhostC:\Users\corra\Documents\R\win-library\3.4\ari
#> \test\ari_intro.html#2": The filename, directory name, or volume label
#> syntax is incorrect
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="file://localhostC:\Users\corra\Documents\R\win-library\3.4\ari
#> \test\ari_intro.html#3": The filename, directory name, or volume label
#> syntax is incorrect
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="file://localhostC:\Users\corra\Documents\R\win-library\3.4\ari
#> \test\ari_intro.html#4": The filename, directory name, or volume label
#> syntax is incorrect
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="file://localhostC:\Users\corra\Documents\R\win-library\3.4\ari
#> \test\ari_intro.html#5": The filename, directory name, or volume label
#> syntax is incorrect
#> Warning: running command '"C:\Users\corra\AppData\Roaming/PhantomJS/
#> phantomjs.exe" "C:/Users/corra/Documents/R/win-library/3.4/webshot/
#> webshot.js" "[{\"url\":\"file:///C:/Users/corra/AppData/Local/Temp/
#> RtmpOCgHzv/file:/localhostC:/Users/corra/Documents/R/win-library/3.4/
#> ari/test/ari_intro.html#1\",\"file\":\"C:\\Users\\corra\\AppData\\Local\
#> \Temp\\RtmpOCgHzv/ari_img_1_xwpH7ZergYFH.jpeg\",\"vwidth\":992,\"vheight
#> \":744,\"delay\":0.2,\"zoom\":1},{\"url\":\"file:///C:/Users/corra/
#> AppData/Local/Temp/RtmpOCgHzv/file:/localhostC:/Users/corra/Documents/R/
#> win-library/3.4/ari/test/ari_intro.html#2\",\"file\":\"C:\\Users\\corra\
#> \AppData\\Local\\Temp\\RtmpOCgHzv/ari_img_2_xwpH7ZergYFH.jpeg\",\"vwidth
#> \":992,\"vheight\":744,\"delay\":0.2,\"zoom\":1},{\"url\":\"file:///C:/
#> Users/corra/AppData/Local/Temp/RtmpOCgHzv/file:/localhostC:/Users/corra/
#> Documents/R/win-library/3.4/ari/test/ari_intro.html#3\",\"file\":\"C:\
#> \Users\\corra\\AppData\\Local\\Temp\\RtmpOCgHzv/ari_img_3_xwpH7ZergYFH.jpeg
#> \",\"vwidth\":992,\"vheight\":744,\"delay\":0.2,\"zoom\":1},{\"url\":
#> \"file:///C:/Users/corra/AppData/Local/Temp/RtmpOCgHzv/file:/localhostC:/
#> Users/corra/Documents/R/win-library/3.4/ari/test/ari_intro.html#4\",
#> \"file\":\"C:\\Users\\corra\\AppData\\Local\\Temp\\RtmpOCgHzv/
#> ari_img_4_xwpH7ZergYFH.jpeg\",\"vwidth\":992,\"vheight\":744,\"delay\":
#> 0.2,\"zoom\":1},{\"url\":\"file:///C:/Users/corra/AppData/Local/Temp/
#> RtmpOCgHzv/file:/localhostC:/Users/corra/Documents/R/win-library/3.4/ari/
#> test/ari_intro.html#5\",\"file\":\"C:\\Users\\corra\\AppData\\Local\\Temp
#> \\RtmpOCgHzv/ari_img_5_xwpH7ZergYFH.jpeg\",\"vwidth\":992,\"vheight\":
#> 744,\"delay\":0.2,\"zoom\":1}]"' had status 1
#> Error in webshot(url = paste0(slides, "#", slide_nums), file = img_paths, : webshot.js returned failure value: 1

Session info

devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.2 (2017-09-28)
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  tz       Europe/Berlin               
#>  date     2017-10-08
#> Packages -----------------------------------------------------------------
#>  package       * version  date       source        
#>  ari             0.1.0    2017-08-31 CRAN (R 3.4.2)
#>  assertthat      0.2.0    2017-04-11 CRAN (R 3.4.1)
#>  aws.polly       0.1.2    2016-12-08 CRAN (R 3.4.1)
#>  aws.signature   0.3.5    2017-07-01 CRAN (R 3.4.1)
#>  backports       1.1.0    2017-05-22 CRAN (R 3.4.0)
#>  base          * 3.4.2    2017-09-28 local         
#>  base64enc       0.1-3    2015-07-28 CRAN (R 3.4.0)
#>  compiler        3.4.2    2017-09-28 local         
#>  curl            2.8.1    2017-07-21 CRAN (R 3.4.1)
#>  datasets      * 3.4.2    2017-09-28 local         
#>  devtools        1.13.3   2017-08-02 CRAN (R 3.4.1)
#>  digest          0.6.12   2017-01-27 CRAN (R 3.4.1)
#>  evaluate        0.10.1   2017-06-24 CRAN (R 3.4.1)
#>  graphics      * 3.4.2    2017-09-28 local         
#>  grDevices     * 3.4.2    2017-09-28 local         
#>  htmltools       0.3.6    2017-04-28 CRAN (R 3.4.1)
#>  httr            1.3.1    2017-08-20 CRAN (R 3.4.1)
#>  jsonlite        1.5      2017-06-01 CRAN (R 3.4.1)
#>  knitr           1.17     2017-08-10 CRAN (R 3.4.1)
#>  magrittr        1.5      2014-11-22 CRAN (R 3.4.1)
#>  MASS            7.3-47   2017-04-21 CRAN (R 3.4.1)
#>  memoise         1.1.0    2017-04-21 CRAN (R 3.4.1)
#>  methods       * 3.4.2    2017-09-28 local         
#>  prettyunits     1.0.2    2015-07-13 CRAN (R 3.4.1)
#>  progress        1.1.2    2016-12-14 CRAN (R 3.4.1)
#>  purrr           0.2.3    2017-08-02 CRAN (R 3.4.1)
#>  R6              2.2.2    2017-06-17 CRAN (R 3.4.1)
#>  Rcpp            0.12.12  2017-07-15 CRAN (R 3.4.1)
#>  rlang           0.1.2    2017-08-09 CRAN (R 3.4.1)
#>  rmarkdown       1.6      2017-06-15 CRAN (R 3.4.1)
#>  rprojroot       1.2      2017-01-16 CRAN (R 3.4.1)
#>  rvest           0.3.2    2016-06-17 CRAN (R 3.4.1)
#>  selectr         0.3-1    2016-12-19 CRAN (R 3.4.1)
#>  signal          0.7-6    2015-07-30 CRAN (R 3.4.1)
#>  stats         * 3.4.2    2017-09-28 local         
#>  stringi         1.1.5    2017-04-07 CRAN (R 3.4.0)
#>  stringr         1.2.0    2017-02-18 CRAN (R 3.4.1)
#>  tools           3.4.2    2017-09-28 local         
#>  tuneR           1.3.2    2017-04-10 CRAN (R 3.4.1)
#>  utils         * 3.4.2    2017-09-28 local         
#>  webshot         0.4.2    2017-09-25 CRAN (R 3.4.2)
#>  withr           2.0.0    2017-07-28 CRAN (R 3.4.1)
#>  XML             3.98-1.9 2017-06-19 CRAN (R 3.4.0)
#>  xml2            1.1.1    2017-01-24 CRAN (R 3.4.1)
#>  yaml            2.1.14   2016-11-12 CRAN (R 3.4.1)
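
A minimal sketch of one way to avoid the duplicated prefix described in this report. slide_url() is a hypothetical helper, not part of ari or webshot. It assumes, as reported above, that webshot adds its own "file:///" prefix on Windows, so the plain path is passed through there; elsewhere it adds the slash after localhost that the report found missing.

```r
# Hypothetical helper (not ari's code): build the URL handed to webshot.
slide_url <- function(path, windows = .Platform$OS.type == "windows") {
  # Use forward slashes throughout so the result is a valid URL path.
  path <- gsub("\\", "/", path, fixed = TRUE)
  if (windows) {
    # Assumption from the report: webshot prepends "file:///" itself on
    # Windows, so adding another scheme here would duplicate the prefix.
    return(path)
  }
  # Make sure exactly one slash separates "localhost" from the path.
  paste0("file://localhost", if (!startsWith(path, "/")) "/", path)
}

slide_url("C:\\Users\\me\\deck.html", windows = TRUE)
slide_url("/home/me/deck.html", windows = FALSE)
```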

Goals for the ariverse

Better documentation, including vignettes on how to use it
Better marketing -- what does ari actually do?
Make the ariverse more modular and stable
Make text2speech so that other text to speech engines can be added in (akin to knitr and pandoc)
Revive https://github.com/seankross/ari-on-docker

ari_stitch() image path on win + RStudio

The ffmpeg concat call prepends the working directory to each file path (at least on my system). I solved it by resetting
images <- basename(images) just before the loop over input_txt_path.

Below is the reprex, run after applying the patch I used to fix issue #2.

ari::ari_narrate(
  script = system.file("test", "ari_intro_script.md", package = "ari"),
  slides = system.file("test", "ari_intro.html", package = "ari"),
  voice  = "Joey"
)
#> "C:/Program Files/RStudio/bin/pandoc/pandoc" +RTS -K512m -RTS ari_intro_script.utf8.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output pandoc1c0c414b4119.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template "C:\Users\corra\Documents\R\win-library\3.4\rmarkdown\rmd\h\default.html" --no-highlight --variable highlightjs=1 --variable "theme:bootstrap" --include-in-header "C:\Users\corra\AppData\Local\Temp\RtmpCiyuCW\rmarkdown-str1c0c69c45021.html" --mathjax --variable "mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
#> 
#> Output created: C:\Users\corra\AppData\Local\Temp\Rtmpi8VlWj/ari_script_Sr4gMM78ULl4.html
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="C:\Users\corra\Documents\R\win-library\3.4\ari\test
#> \ari_intro.html#1": The system cannot find the file specified
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="C:\Users\corra\Documents\R\win-library\3.4\ari\test
#> \ari_intro.html#2": The system cannot find the file specified
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="C:\Users\corra\Documents\R\win-library\3.4\ari\test
#> \ari_intro.html#3": The system cannot find the file specified
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="C:\Users\corra\Documents\R\win-library\3.4\ari\test
#> \ari_intro.html#4": The system cannot find the file specified
#> Warning in normalizePath(path.expand(path), winslash, mustWork):
#> path[1]="C:\Users\corra\Documents\R\win-library\3.4\ari\test
#> \ari_intro.html#5": The system cannot find the file specified
#> Warning: running command 'C:\ffmpeg\bin\ffmpeg.exe -y -f concat -safe 0 -
#> i C:\Users\corra\AppData\Local\Temp\Rtmpi8VlWj/ari_input_LcPVQ5rUzxVj.txt -
#> i C:\Users\corra\AppData\Local\Temp\Rtmpi8VlWj/ari_audio_F5L1gjRHbCnw.wav
#> -c:v libx264 -c:a aac -b:a 192k -shortest -vsync vfr -pix_fmt yuv420p
#> output.mp4' had status 1

Session info

devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.4.2 (2017-09-28)
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  tz       Europe/Berlin               
#>  date     2017-10-08
#> Packages -----------------------------------------------------------------
#>  package       * version  date       source        
#>  ari             0.1.0    2017-10-07 local         
#>  assertthat      0.2.0    2017-04-11 CRAN (R 3.4.1)
#>  aws.polly       0.1.2    2016-12-08 CRAN (R 3.4.1)
#>  aws.signature   0.3.5    2017-07-01 CRAN (R 3.4.1)
#>  backports       1.1.0    2017-05-22 CRAN (R 3.4.0)
#>  base          * 3.4.2    2017-09-28 local         
#>  base64enc       0.1-3    2015-07-28 CRAN (R 3.4.0)
#>  compiler        3.4.2    2017-09-28 local         
#>  curl            2.8.1    2017-07-21 CRAN (R 3.4.1)
#>  datasets      * 3.4.2    2017-09-28 local         
#>  devtools        1.13.3   2017-08-02 CRAN (R 3.4.1)
#>  digest          0.6.12   2017-01-27 CRAN (R 3.4.1)
#>  evaluate        0.10.1   2017-06-24 CRAN (R 3.4.1)
#>  graphics      * 3.4.2    2017-09-28 local         
#>  grDevices     * 3.4.2    2017-09-28 local         
#>  htmltools       0.3.6    2017-04-28 CRAN (R 3.4.1)
#>  httr            1.3.1    2017-08-20 CRAN (R 3.4.1)
#>  jsonlite        1.5      2017-06-01 CRAN (R 3.4.1)
#>  knitr           1.17     2017-08-10 CRAN (R 3.4.1)
#>  magrittr        1.5      2014-11-22 CRAN (R 3.4.1)
#>  MASS            7.3-47   2017-04-21 CRAN (R 3.4.1)
#>  memoise         1.1.0    2017-04-21 CRAN (R 3.4.1)
#>  methods       * 3.4.2    2017-09-28 local         
#>  prettyunits     1.0.2    2015-07-13 CRAN (R 3.4.1)
#>  progress        1.1.2    2016-12-14 CRAN (R 3.4.1)
#>  purrr           0.2.3    2017-08-02 CRAN (R 3.4.1)
#>  R6              2.2.2    2017-06-17 CRAN (R 3.4.1)
#>  Rcpp            0.12.12  2017-07-15 CRAN (R 3.4.1)
#>  rlang           0.1.2    2017-08-09 CRAN (R 3.4.1)
#>  rmarkdown       1.6      2017-06-15 CRAN (R 3.4.1)
#>  rprojroot       1.2      2017-01-16 CRAN (R 3.4.1)
#>  rvest           0.3.2    2016-06-17 CRAN (R 3.4.1)
#>  selectr         0.3-1    2016-12-19 CRAN (R 3.4.1)
#>  signal          0.7-6    2015-07-30 CRAN (R 3.4.1)
#>  stats         * 3.4.2    2017-09-28 local         
#>  stringi         1.1.5    2017-04-07 CRAN (R 3.4.0)
#>  stringr         1.2.0    2017-02-18 CRAN (R 3.4.1)
#>  tools           3.4.2    2017-09-28 local         
#>  tuneR           1.3.2    2017-04-10 CRAN (R 3.4.1)
#>  utils         * 3.4.2    2017-09-28 local         
#>  webshot         0.4.2    2017-09-25 CRAN (R 3.4.2)
#>  withr           2.0.0    2017-07-28 CRAN (R 3.4.1)
#>  XML             3.98-1.9 2017-06-19 CRAN (R 3.4.0)
#>  xml2            1.1.1    2017-01-24 CRAN (R 3.4.1)
#>  yaml            2.1.14   2016-11-12 CRAN (R 3.4.1)

When run outside reprex, the error is more informative. In particular, the following is the output of my console:

ffmpeg version 3.3.3 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 7.1.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-zlib
  libavutil      55. 58.100 / 55. 58.100
  libavcodec     57. 89.100 / 57. 89.100
  libavformat    57. 71.100 / 57. 71.100
  libavdevice    57.  6.100 / 57.  6.100
  libavfilter     6. 82.100 /  6. 82.100
  libswscale      4.  6.100 /  4.  6.100
  libswresample   2.  7.100 /  2.  7.100
  libpostproc    54.  5.100 / 54.  5.100
[concat @ 00000000006128a0] Impossible to open 'C:\Users\corra\OneDrive\miscR\ari_video_slide/C:\Users\corra\OneDrive\miscR\ari_video_slide\ari_img_1_L810ChiVwf14p.jpeg'
C:\Users\corra\OneDrive\miscR\ari_video_slide/ari_input_mKfvdbvsyVSq.txt: Invalid argument

As you can see, the file path appears twice.
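
The basename() workaround can be sketched as follows. write_concat_list() is a hypothetical helper, not ari's actual code. It assumes the list file is written to the same directory as the images, since ffmpeg's concat demuxer resolves relative entries against the location of the list file.

```r
# Hypothetical helper: write an ffmpeg concat list that uses bare file
# names, so the working directory cannot be prepended twice.
write_concat_list <- function(images, durations, list_path) {
  stopifnot(length(images) == length(durations), length(images) > 0)
  image_names <- basename(images)
  # Interleave "file" and "duration" lines, one pair per image.
  entries <- as.vector(rbind(
    paste0("file '", image_names, "'"),
    paste0("duration ", durations)
  ))
  # The list ends with the last image repeated, without a duration.
  entries <- c(entries, paste0("file '", image_names[length(image_names)], "'"))
  writeLines(entries, list_path)
  invisible(entries)
}

list_file <- tempfile(fileext = ".txt")
write_concat_list(c("/tmp/deck/img_1.jpeg", "/tmp/deck/img_2.jpeg"), c(14, 6), list_file)
readLines(list_file)
```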

Roadmap to 0.4.1

Add user facing messages with https://github.com/r-lib/cli (replace print())

subtitles

I wonder whether the function "ari_burn_subtitles" is working. I can burn in subtitles, rather clumsily, from bash, but when I try to run that function I get this error in RStudio: could not find function "ari_burn_subtitles"

AppVeyor Error for Windows

On Windows, download.file() fails to open this URL: https://api.github.com/repos/cloudyr/aws.signature/contents/DESCRIPTION?ref=HEAD
Build log: https://ci.appveyor.com/project/muschellij2/ari-j1o6s/builds/42779137

package 'remotes' successfully unpacked and MD5 sums checked
The downloaded binary packages are in
	C:\Users\appveyor\AppData\Local\Temp\1\RtmpkZOg3y\downloaded_packages
+ return
+ Rscript -e 'if (!("remotes" %in% rownames(installed.packages()))) q(status=1)'
+ echo 'Installing dependencies'
Installing dependencies
+ Rscript -e 'options(repos = c(CRAN = "https://cloud.r-project.org"), download.file.method = "auto"); remotes::install_deps(dependencies = TRUE, type="win.binary")'
Error in utils::download.file(url, path, method = method, quiet = quiet,  : 
  cannot open URL 'https://api.github.com/repos/cloudyr/aws.signature/contents/DESCRIPTION?ref=HEAD'
Calls: <Anonymous> ... github_DESCRIPTION -> download -> base_download -> base_download_headers
Execution halted
Command exited with code 1
7z a failure.zip *.Rcheck\*
7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21

Beamer presentation compatibility

Is there a way to have beamer presentations narrated by ari?

Skip text2speech-ing slides without speaker notes

Idea from @salernos when using Loqui:

I wonder if there is a way to modify this so slides without speaker notes can be included?
I plan on filling in all of the slides with speaker notes, but for things like appendix slides, I wonder if some skip logic could be added. A general solution (for more than just slides without any speaker notes) might use regex pattern matching on a marker like {#} in the speaker notes, where # is replaced with a numeric value that dictates how long the pause should be. So in the middle of a block of text, you could have a 3 second pause with: "Here is something I want to say with a dramatic pause.{3} Here is where I continue"
And for slides without speaker notes, you could just have "{10}" or something to give 10 seconds of pause to that slide
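
The {#} idea could be prototyped along these lines. This is only a sketch of the proposed marker syntax in base R; split_pauses() and its output columns are hypothetical and nothing like it is confirmed to exist in ari or Loqui.

```r
# Hypothetical parser for the proposed "{seconds}" pause markers.
# Returns one row per spoken segment with the pause that follows it.
split_pauses <- function(text) {
  pattern <- "\\{[0-9]+(\\.[0-9]+)?\\}"
  hits <- gregexpr(pattern, text)
  markers <- regmatches(text, hits)[[1]]
  # invert = TRUE returns the text between the markers.
  speech <- trimws(regmatches(text, hits, invert = TRUE)[[1]])
  # The final segment has no marker after it, so its pause is 0.
  pause_after <- c(as.numeric(gsub("[{}]", "", markers)), 0)
  segments <- data.frame(speech = speech, pause_after = pause_after,
                         stringsAsFactors = FALSE)
  # Drop empty segments that carry no pause either.
  keep <- nzchar(segments$speech) | segments$pause_after > 0
  segments[keep, , drop = FALSE]
}

split_pauses("Here is a dramatic pause.{3} Here is where I continue")
split_pauses("{10}")  # a slide with no notes: silence only
```

A slide whose notes are just "{10}" yields a single row with empty speech and a 10 second pause, which a caller could render as silence instead of sending it to a speech engine.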

Notes from meeting with Sean

dev is our development branch. We merge all our changes into dev and when we ship to CRAN, we merge dev into main and then ship main to CRAN.
Roadmap issue will have checkbox, one PR per checkbox, and tag which PR addresses that checkbox.

Determine modular packages from the functions in #46

#46 is a big overhaul. We need to take some time to think about how we can split up the functions there into more modular packages.

This means we need to be able to answer what the main workflows and use cases are for ari and any resulting packages.

So far it sounds like ari's main functionality is making videos but what are the main steps one would take to get there?

Refactoring into API and GUI versions of ari implementation

See this powerpoint: https://docs.google.com/presentation/d/1Vjvq7PYuWsTkGi2EkXpnk0KtQYhbPSidBhMFQcqyb8I/edit

The idea is that mario will be the API version of a video maker while loqui will be the GUI version of a video maker. Both will be powered in the back end by functions here in ari.

Steps involved with this:

Add free version of tts-coqui to these text2speech and ari functions
Make GUI in loqui repo
Update mario repo for deployment to Hutch servers
Figure out billing for mario and loqui for when people want to use fancier tts options like Polly

ffmpeg install issue?

Loving ari! One thing I had trouble with was ffmpeg. May be best to suggest using brew for ffmpeg installation? I installed from source initially (version was 3.4.1). When I then used brew install ffmpeg, no longer got an error from ari. (Sorry I don't have the error message I got to actually show you the issue...I'm the worst.)
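
When debugging this kind of problem, it can help to check which ffmpeg binary R actually sees. A minimal sketch using base R's Sys.which(); find_ffmpeg() is a hypothetical helper name, not an ari function.

```r
# Hypothetical helper: return the path of the ffmpeg binary found on the
# PATH visible to R, or NA if none is found.
find_ffmpeg <- function(cmd = "ffmpeg") {
  path <- unname(Sys.which(cmd))
  if (nzchar(path)) path else NA_character_
}

ffmpeg <- find_ffmpeg()
if (is.na(ffmpeg)) {
  message("ffmpeg was not found on the PATH seen by R")
} else {
  # Print the version banner of the binary that would be used.
  system2(ffmpeg, "-version")
}
```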

ari_stitch example not working

I'm running Ubuntu Studio 20.10. I installed both ari 0.3.5 (the latest off CRAN) and ffmpeg. All I did was try to run the simple ari_stitch example from the readme:

 ari_stitch(
   ari_example(c("mab1.png", "mab2.png")),
   list(noise(), noise()))

But I did not get a resultant video file. This is what I saw in the console:

ffmpeg version 4.3.1-4ubuntu1 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 10 (Ubuntu 10.2.0-9ubuntu2)
  configuration: --prefix=/usr --extra-version=4ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  WARNING: library configuration mismatch
  avcodec     configuration: --prefix=/usr --extra-version=4ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared --enable-version3 --disable-doc --disable-programs --enable-libaribb24 --enable-liblensfun --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libtesseract --enable-libvo_amrwbenc
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, concat, from 'ari_input_QKmW6seOAMcH.txt':
  Duration: N/A, start: 0.000000, bitrate: N/A
    Stream #0:0: Video: png, rgba(pc), 480x480, 25 tbr, 25 tbn, 25 tbc
Input #1, wav, from '/tmp/RtmpPa3t1d/ari_audio_48BDjSC6Spu4.wav':
  Duration: 00:00:02.00, bitrate: 1411 kb/s
    Stream #1:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 44100 Hz, 1 channels (FL), flt, 1411 kb/s
Multiple -filter, -af or -vf options specified for stream 0, only the last option '-filter:v scale=trunc(iw/2)*2:trunc(ih/2)*2' will be used.
Stream mapping:
  Stream #0:0 -> #0:0 (png (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (pcm_f32le (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x555e595670c0] using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2
[libx264 @ 0x555e595670c0] profile High, level 3.0, 4:2:0, 8-bit
[libx264 @ 0x555e595670c0] 264 - core 160 r3011 cde9a93 - H.264/MPEG-4 AVC codec - Copyleft 2003-2020 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/tmp/RtmpPa3t1d/file47e92baea8a4.mp4':
  Metadata:
    encoder         : Lavf58.45.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 480x480, q=-1--1, 25 fps, 12800 tbn, 25 tbc
    Metadata:
      encoder         : Lavc58.91.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.91.100 aac
[mp4 @ 0x555e59565b80] Starting second pass: moving the moov atom to the beginning of the file
frame=    3 fps=0.0 q=-1.0 Lsize=      39kB time=00:00:02.02 bitrate= 156.2kbits/s speed=37.9x    
video:5kB audio:31kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 4.825809%
[libx264 @ 0x555e595670c0] frame I:1     Avg QP:11.04  size:  3231
[libx264 @ 0x555e595670c0] frame P:1     Avg QP:18.21  size:  1516
[libx264 @ 0x555e595670c0] frame B:1     Avg QP:15.37  size:   125
[libx264 @ 0x555e595670c0] consecutive B-frames: 33.3% 66.7%  0.0%  0.0%
[libx264 @ 0x555e595670c0] mb I  I16..4: 50.3% 41.3%  8.3%
[libx264 @ 0x555e595670c0] mb P  I16..4:  1.1%  2.7%  3.6%  P16..4:  0.4%  0.1%  0.0%  0.0%  0.0%    skip:92.1%
[libx264 @ 0x555e595670c0] mb B  I16..4:  0.1%  0.0%  0.0%  B16..8:  9.0%  0.4%  0.0%  direct: 0.0%  skip:90.4%  L0:19.4% L1:80.0% BI: 0.6%
[libx264 @ 0x555e595670c0] 8x8 transform intra:41.0% inter:0.0%
[libx264 @ 0x555e595670c0] coded y,uvDC,uvAC intra: 5.3% 0.0% 0.0% inter: 0.2% 0.0% 0.0%
[libx264 @ 0x555e595670c0] i16 v,h,dc,p: 83% 11%  6%  0%
[libx264 @ 0x555e595670c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 75%  2% 23%  0%  0%  0%  0%  0%  0%
[libx264 @ 0x555e595670c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 41% 14% 33%  2%  3%  2%  2%  2%  2%
[libx264 @ 0x555e595670c0] i8c dc,h,v,p: 100%  0%  0%  0%
[libx264 @ 0x555e595670c0] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x555e595670c0] kb/s:12.99
[aac @ 0x555e59568740] Qavg: 1001.039

Use of SSML

I tried to use SSML when having my texts spoken by Amazon Polly.
Unfortunately, it does not work. See the example below: I tried both with and without <speak>.

Does anybody have an idea?

Thanks.

slides <- system.file("test", c("mab2.png", "mab1.png"),
                      package = "ari")
sentences <- c("Welcome to my very <s> interesting lecture.",
               "<speak>Here are some fantastic <s> equations I came up with.</speak>")
ari_spin(slides, sentences, voice = "Brian", output = "Videos/slides_test_test.mp4")

Error when working from directory with spaces

When calling ari_narrate() from a directory that contains whitespace, the ffmpeg command fails with the error:
/Users/jules/tmp/with: No such file or directory

Working directory is:
> getwd()
[1] "/Users/jules/tmp/with space"

tmpfile ari_input_Xrz...txt reads

file '/Users/jules/tmp/with space/ari_img_1_a6KgA7H82gfs.jpeg'
duration 14
file '/Users/jules/tmp/with space/ari_img_2_a6KgA7H82gfs.jpeg'
duration 6
file '/Users/jules/tmp/with space/ari_img_2_a6KgA7H82gfs.jpeg'
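
The error message stops at the first space, which suggests that a path reaches the shell unquoted. A minimal sketch of quoting the paths with base R's shQuote() before building the command; ffmpeg_concat_args() is a hypothetical helper, not ari's code, and the flags are the ones visible in the ffmpeg commands logged in the issues above.

```r
# Hypothetical helper: build ffmpeg arguments with every path quoted,
# so a directory such as "with space" stays a single argument.
ffmpeg_concat_args <- function(list_path, audio_path, output,
                               quote_type = if (.Platform$OS.type == "windows") "cmd" else "sh") {
  c("-y", "-f", "concat", "-safe", "0",
    "-i", shQuote(list_path, type = quote_type),
    "-i", shQuote(audio_path, type = quote_type),
    shQuote(output, type = quote_type))
}

args <- ffmpeg_concat_args(
  "/Users/jules/tmp/with space/ari_input.txt",
  "/Users/jules/tmp/with space/ari_audio.wav",
  "output.mp4"
)
paste(args, collapse = " ")
```

The resulting vector can be passed to system2("ffmpeg", args). system2() pastes its arguments into one command line, which is why each path needs quoting.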

Support for Neural voice engines on Amazon Polly

I find some of the Amazon Polly voices are a little too fast for non-native English speakers to understand easily, but the Neural version of those same voices is that little bit slower and seems to work better for them.

Therefore, I've been trying to determine if there's a way to specify the Neural engine for the voice I've chosen but I'm not sure this is possible at present. Is it?

Roadmap to 0.4.0

get rid of text2speech specific code
specify/document voice engine API (#57)
generally reduce number of arguments in functions (especially relating to ffmpeg)
write a vignette about what kinds of args to pass to ffmpeg
See if the burning subtitles functions work.
Should write some new tests for Ari that use tts coqui since it's free. A really cool test would go from tts to stt. (likely these only get run locally and not on cran)

Bonus:

spin out ariExtra functions into separate utility packages

SSML for AWS POLLY

Great package. Are there any plans to allow for SSML? There might be more to this, but it seems like it's just a matter of adding ",..." at the end of this line in ari_spin, or of creating a specific list parameter that is passed into ari_spin and forwarded to tts.

wav <- text2speech::tts(text = paragraphs[i], voice = voice,
service = service, bind_audio = TRUE)
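
The "..." forwarding suggested here can be illustrated with a stand-in for the speech function. In this sketch spin_sketch() and fake_tts() are hypothetical and the engine argument is a made-up example; none of this is the actual ari_spin or text2speech::tts code.

```r
# Hypothetical wrapper: any extra arguments given to spin_sketch() are
# forwarded unchanged to the speech function through "...".
spin_sketch <- function(paragraphs, voice, tts_fun, ...) {
  lapply(paragraphs, function(p) tts_fun(text = p, voice = voice, ...))
}

# Stand-in for a text-to-speech call; it only records what it received.
fake_tts <- function(text, voice, ...) {
  list(text = text, voice = voice, extra = list(...))
}

out <- spin_sketch(c("First slide.", "Second slide."), "Joey", fake_tts,
                   engine = "example")
out[[2]]$extra$engine
```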

Voice Cloning

For purposes of the prototype of loqui-vc, hack together an ari_spin_vc() function that incorporates the voice-cloned audio file that is generated by the newly written code in jhudsl/text2speech#40.

path error

Hi,
I'm getting this error when running ari.
ari_narrate("Introduction_to_R.Rmd","Introduction_to_R.html",voice = "Kendra", delay = 0.5, capture_method = "iterative")
Could not load file:///C:/Documents/file:/localhost/C:/Documents/Introduction_to_R.html#1
Error in webshot(url = paste0(slides, "#", i), file = img_paths[i], ...) :
webshot.js returned failure value: 1
In addition: Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
path[1]="file://localhost/C:\Documents\Introduction_to_R.html#1": The filename, directory name, or volume label syntax is incorrect

I'm using windows. Is there a way to fix it?

Thanks a lot.

Unable to use animations on xaringan presentations

I was able to use xaringan to generate my videos, but when I add slide transitions or animated gifs to the presentation, they seem to be ignored by ari and are not reproduced in the video.

Slide transitions and gifs are working perfectly in the HTML presentation.

To generate animated plots I used gganimate and for slide transitions I followed the steps suggested in this link: https://www.garrickadenbuie.com/blog/animate-xaringan-slide-transitions/

Do I need to set a specific parameter in ari for the animations to work, or does it not currently support animations?

Pause between slides

Can we add a pause between slides, or some parameter for that?

With ffmpeg 5.0.1, ari_stitch fails to stitch

With ffmpeg 5.0.1, ari_stitch fails to stitch the images and the wav file because it uses vsync 2, and a numeric value is no longer supported for that option.
With versions of ffmpeg that do not accept numbers for this parameter, it should use "vfr" instead of vsync 2.

Example

when running:

ari_narrate(output = "test.mp4", service = "google",
            cleanup = F, capture_method = "iterative",
            script = "ari_comments.Rmd",
  voice = "en-US-Standard-B")

Console output says:

"Passing a number to -vsync is deprecated, use a string argument as described in the manual."

The video is created, but the images and sound are not in sync.

I could fix it by changing the vsync option to "vfr" in the source code of the ari_stitch function, because it is not an option that can be passed directly to ari_narrate.

Maybe the solution is just to add a video_sync_method parameter to ari_narrate that is passed on to the ari_stitch call.
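
One way to keep accepting old numeric settings while always handing ffmpeg a string is a small lookup. This is a sketch only: vsync_name() is a hypothetical helper, and the number-to-name mapping reflects my reading of the ffmpeg documentation for -vsync, so it should be checked against the installed version.

```r
# Hypothetical helper: translate a legacy numeric -vsync value into the
# string form; strings such as "vfr" pass through unchanged.
vsync_name <- function(method) {
  names_by_number <- c("-1" = "auto", "0" = "passthrough",
                       "1" = "cfr", "2" = "vfr")
  key <- as.character(method)
  if (key %in% names(names_by_number)) unname(names_by_number[key]) else key
}

vsync_name(2)      # legacy numeric value
vsync_name("vfr")  # already a string
```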

Is there an example video?

The readme has example code to generate a video, but it is missing an example of the finished video.

I would like to see the quality of the final output before trying this out.

Cannot sign in to RStudio

Thanks for the interesting attempt.

I tried to follow the ari vignette, but I couldn't sign in to RStudio Server with the username/password given in the vignette.
https://cran.r-project.org/web/packages/ari/vignettes/Simple-Ari-Configuration-with-Docker.html
Could you check that?

Integrate open-source Text-to-Speech synthesizers

ARI is a great package, but I don't love the fact that its speech synthesis is dependent on Amazon. Not only does it complicate the setup process, it makes ARI less open source.

Adding an option for open-source speech synthesizers like eSpeak, Mycroft and/or MARYTTS would address this issue. They can all be easily downloaded and compiled on any OS, and have open licenses. Granted, I don't know how easy they are to call from R, but since they're written in C, I don't imagine it's terribly difficult.

Not all the voice options are great, but some are certainly adequate (at least in English). Adding additional mbrola voices could potentially help.
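
As a rough illustration of how a command-line synthesizer might be called from R, the sketch below builds arguments for the eSpeak CLI and only runs it if an `espeak` binary is found on the PATH. `espeak_args()` is a hypothetical helper, not part of ari; the `-v` (voice) and `-w` (write WAV file) options are eSpeak's own.

```r
# Hypothetical helper (not part of ari): arguments for the espeak CLI.
espeak_args <- function(text, wav_path, voice = "en") {
  stopifnot(length(text) == 1, nzchar(text))
  c("-v", voice, "-w", shQuote(wav_path), shQuote(text))
}

wav_path <- file.path(tempdir(), "slide1.wav")
args <- espeak_args("Hello from ari", wav_path)

# Only attempt synthesis when espeak is actually installed.
if (nzchar(Sys.which("espeak"))) {
  system2("espeak", args)
}
```

The resulting WAV files could then be passed as the audio input when stitching a video.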

Ffmpeg error with divisibility

Running

library(ari)
library(aws.polly)

aws.signature::use_credentials(profile = "polly")

files = list.files(pattern = ".png$",
                   full.names = TRUE)
files = path.expand(files)
para = readLines("script.txt")
para = para[ !para %in% ""]
ari_spin(paragraphs = para,
         images = files, output = "joey.mp4",
         voice = "Joey")

and getting

[libx264 @ 0x7fe0e2811600] height not divisible by 2 (3000x1687)
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height

and it seems like you can add -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" to ffmpeg as per https://stackoverflow.com/questions/20847674/ffmpeg-libx264-height-not-divisible-by-2/29582287
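
To make the arithmetic behind that filter concrete, here is a small sketch. `even_dims()` is a hypothetical helper that mirrors what `trunc(x/2)*2` does to each dimension, and the last lines show how the filter string could be appended to ffmpeg options.

```r
# Hypothetical helper: round each dimension down to the nearest even number,
# which is what scale=trunc(iw/2)*2:trunc(ih/2)*2 does inside ffmpeg.
even_dims <- function(width, height) {
  c(width = floor(width / 2) * 2, height = floor(height / 2) * 2)
}

even_dims(3000, 1687)  # the 3000x1687 frame from the error becomes 3000x1686

# The filter as it could be passed on an ffmpeg command line.
scale_filter <- '"scale=trunc(iw/2)*2:trunc(ih/2)*2"'
ffmpeg_opts <- paste("-vf", scale_filter)
ffmpeg_opts
```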

script.txt

Slides not advancing?

First made the mistake of using a pre-existing, cut-down reveal.js presentation, then rebuilt a brand new one from the stock ioslides template.

Have checked all spacing and notation against the repo examples and can't get slides to advance regardless of whether I use the .md or .rmd versions. TIA

title: "test2"
author: "foo"
date: "May 22, 2018"
output: ioslides_presentation

R Markdown

This is an R Markdown presentation. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Slide with Bullets

Bullet 1
Bullet 2
Bullet 3

Slide with R Output

summary(cars)

Slide with Plot

plot(pressure)

Bug in `ari_stitch()`

When I run ari_stitch() on the images/audio generated from this test set of Google slides, I get a video file where the last slide is missing.

For example, if I have 6 PNG images and 6 WAV audio files, only 5 of them are woven into the final output.

Reproducible Example:

utils.R

pad_wav <- function(wav, duration = NULL) {
  # See if wav inherits from "Wave" class
  is_Wave <- inherits(wav, "Wave")
  if (is_Wave) {
    wav <- list(wav)
  }
  if (is.null(duration)) {
    duration <- rep(NA, length(wav))
  }
  stopifnot(length(duration) == length(wav))
  # Iterate over wav and find "ideal duration"
  duration <- map2_int(.x = wav, .y = duration,
                       .f = function(wav, dur) {
                         ideal_duration <- ceiling(length(wav@left) / wav@samp.rate)
                         if (!is.na(dur)) {
                           ideal_duration <- max(ideal_duration, dur)
                         }
                         ideal_duration
                       })
  # Iterate over wav and create end_wav that binds to existing wav
  out_wav <- map2(.x = wav,
                  .y = duration,
                  .f = function(wav, ideal_duration) {
                    left <- rep(0, wav@samp.rate * ideal_duration - length(wav@left))
                    right <- numeric(0)
                    if (wav@stereo) {
                      right <- left
                    }
                    end_wav <- tuneR::Wave(
                      left = left,
                      right = right,
                      bit = wav@bit,
                      samp.rate = wav@samp.rate,
                      pcm = wav@pcm
                    )
                    wav <- tuneR::bind(wav, end_wav)
                    wav
                  })
  
  if (is_Wave) {
    out_wav <- out_wav[[1]]
  }
  
  return(out_wav)
}


match_sample_rate <- function(audio, verbose = TRUE) {
  if (inherits(audio, "Wave")) {
    return(audio)
  }
  # iterate over audio and extract sampling rate
  sample_rate <- sapply(audio, function(r) r@samp.rate)
  
  if (!all(sample_rate == sample_rate[[1]]) && verbose) {
    message("enforcing same sample rate, using minimum")
  }
  # get minimum sampling rate
  sample_rate <- min(sample_rate, na.rm = TRUE)
  if (verbose) {
    message(paste0("Sample rate downsampled to ", sample_rate))
  }
  # downsample wave object to sample_rate
  audio <- lapply(audio, function(x) {
    if (x@samp.rate == sample_rate) {
      return(x)
    }
    tuneR::downsample(x, samp.rate = sample_rate)
  })
  # iterate over audio and extract out sampling rate 
  sample_rate <- sapply(audio, function(r) r@samp.rate)
  # check if all the values in sample_rate are equal to the first value in sample_rate
  stopifnot(all(sample_rate == sample_rate[[1]]))
  
  return(audio)
}

# get random string
get_random_string <- function() {
  paste(sample(c(seq(10), letters, LETTERS),
               size = 12, replace = TRUE
  ), collapse = "")
}

wav_length <- function(wav) {
  stopifnot(is_Wave(wav))
  length(wav@left) / wav@samp.rate
}

is_Wave <- function(x) {
  identical(suppressWarnings(as.character(class(x))), "Wave")
}

set_encoders.R

get_os <- function() {
  sys_info <- Sys.info()
  os <- tolower(sys_info[["sysname"]])
  return(os)
}

#' Set Default Audio and Video Codecs
#'
#' @param codec The codec to use or get for audio/video.  Uses the
#' `ffmpeg_audio_codec` and `ffmpeg_video_codec` options
#' to store this information.
#' @seealso [ffmpeg_codecs()] for options
#' @return A `NULL` output
#'
#'
#' @rdname codecs
#' @export
#'
#' @examples
#' \dontrun{
#' if (have_ffmpeg_exec()) {
#'   print(ffmpeg_version())
#'   get_audio_codec()
#'   set_audio_codec(codec = "libfdk_aac")
#'   get_audio_codec()
#'   set_audio_codec(codec = "aac")
#'   get_audio_codec()
#' }
#' if (have_ffmpeg_exec()) {
#'   get_video_codec()
#'   set_video_codec(codec = "libx265")
#'   get_video_codec()
#'   set_video_codec(codec = "libx264")
#'   get_video_codec()
#' }
#' ## empty thing
#' if (have_ffmpeg_exec()) {
#'   video_codec_encode("libx264")
#'
#'   audio_codec_encode("aac")
#' }
#' }
set_audio_codec <- function(codec) {
  if (missing(codec)) {
    os <- get_os()
    codec <- switch(os,
                    darwin = "libfdk_aac",
                    windows = "ac3",
                    linux = "aac"
    )
  }
  options(ffmpeg_audio_codec = codec)
}

#' @export
#' @rdname codecs
set_video_codec <- function(codec = "libx264") {
  options(ffmpeg_video_codec = codec)
}

#' @export
#' @rdname codecs
get_audio_codec <- function() {
  codec <- getOption("ffmpeg_audio_codec")
  if (is.null(codec)) {
    os <- get_os()
    res <- ffmpeg_audio_codecs()
    if (is.null(res)) {
      fdk_enabled <- FALSE
    } else {
      fdk_enabled <- grepl("fdk", res[res$codec == "aac", "codec_name"])
    }
    if (fdk_enabled) {
      os_audio_codec <- "libfdk_aac"
    } else {
      os_audio_codec <- "aac"
    }
    codec <- switch(os,
                    darwin = os_audio_codec,
                    windows = "ac3",
                    linux = "aac"
    )
    set_audio_codec(codec = codec)
  }
  return(codec)
}

#' @export
#' @rdname codecs
get_video_codec <- function() {
  codec <- getOption("ffmpeg_video_codec")
  if (is.null(codec)) {
    codec <- "libx264"
    set_video_codec(codec = codec)
  }
  return(codec)
}


#' @rdname codecs
#' @export
audio_codec_encode <- function(codec) {
  res <- ffmpeg_audio_codecs()
  if (is.null(res)) {
    warning("Codec could not be checked")
    return(NA)
  }
  stopifnot(length(codec) == 1)
  res <- res[res$codec %in% codec |
               grepl(codec, res$codec_name), ]
  res$encoding_supported
}

#' @rdname codecs
#' @export
video_codec_encode <- function(codec) {
  res <- ffmpeg_video_codecs()
  if (is.null(res)) {
    warning("Codec could not be checked")
    return(NA)
  }
  stopifnot(length(codec) == 1)
  res <- res[res$codec %in% codec |
               grepl(codec, res$codec_name), ]
  res$encoding_supported
}

ffmpeg_codecs.R

#' Get Codecs for ffmpeg
#'
#' @return A `data.frame` of codec names and capabilities
#' @export
#'
#' @examples
#' \dontrun{
#' if (ffmpeg_version_sufficient()) {
#'   ffmpeg_codecs()
#'   ffmpeg_video_codecs()
#'   ffmpeg_audio_codecs()
#' }
#' }
ffmpeg_codecs <- function() {
  ffmpeg <- ari::ffmpeg_exec(quote = TRUE)
  cmd <- paste(ffmpeg, "-codecs")
  result <- system(cmd, ignore.stderr = TRUE, ignore.stdout = TRUE)
  res <- system(cmd, intern = TRUE, ignore.stderr = TRUE)
  res <- trimws(res)
  if (length(res) == 0) {
    res <- ""
  }
  if (result != 0 & all(res %in% "")) {
    warning("No codecs output from ffmpeg for codecs")
    return(NULL)
  }
  # extract elements of res that start with either a period or "D"
  res <- res[grepl("^([.]|D)", res)]
  # split by " "
  res <- strsplit(res, " ")
  res <- t(vapply(res, function(x) {
    # trims any leading or trailing white space
    x <- trimws(x)
    # removes any empty strings in each element
    x <- x[x != ""]
    # concatenate the elements that come after the second
    if (length(x) >= 3) {
      x[3:length(x)] <- paste(x[3:length(x)], collapse = " ")
    }
    # return the first 3 elements
    return(x[seq(3)])
  }, FUN.VALUE = character(3)))
  # name the 3 columns
  colnames(res) <- c("capabilities", "codec", "codec_name")
  # convert matrix to dataframe
  res <- as.data.frame(res, stringsAsFactors = FALSE)
  
  if (nrow(res) == 0) {
    warning("No codecs output from ffmpeg for codecs")
    return(NULL)
  }
  res$capabilities <- trimws(res$capabilities)
  
  cap_defns <- res[res$codec == "=", ]
  res <- res[res$codec != "=", ]
  # split each character and rbind 
  cap <- do.call("rbind", strsplit(res$capabilities, split = ""))
  
  cap_defns$codec_name <- tolower(cap_defns$codec_name)
  cap_defns$codec_name <- gsub(" ", "_", cap_defns$codec_name)
  cap_defns$codec_name <- gsub("-", "_", cap_defns$codec_name)
  cap_def <- do.call("rbind", strsplit(cap_defns$capabilities, split = ""))
  
  # create NA matrix
  mat <- matrix(NA, ncol = nrow(cap_defns), nrow = nrow(cap))
  colnames(mat) <- cap_defns$codec_name
  
  icol <- 4
  indices <- apply(cap_def, 1, function(x) which(x != "."))
  # output: vector of indices corresponding to non-"." values in each row of cap_def.
  for (icol in seq(nrow(cap_def))) {
    x <- cap[, indices[icol]]
    mat[, icol] <- x %in% cap_def[icol, indices[icol]]
  }
  mat <- as.data.frame(mat, stringsAsFactors = FALSE)
  
  res <- cbind(res, mat)
  if (any(rowSums(
    res[, c("video_codec", "audio_codec", "subtitle_codec")]
  )
  > 1)) {
    warning("Format may have changed, please post this issue")
  }
  
  # L = list(capabilities = cap_defns,
  #          codecs = res)
  # return(L)
  return(res)
}

#' @rdname ffmpeg_codecs
#' @export
ffmpeg_video_codecs <- function() {
  res <- ffmpeg_codecs()
  if (is.null(res)) {
    return(NULL)
  }
  res <- res[res$video_codec, ]
  res$video_codec <- NULL
  res$audio_codec <- NULL
  res$subtitle_codec <- NULL
  res
}

#' @rdname ffmpeg_codecs
#' @export
ffmpeg_audio_codecs <- function() {
  res <- ffmpeg_codecs()
  if (is.null(res)) {
    return(NULL)
  }
  res <- res[res$audio_codec, ]
  res$video_codec <- NULL
  res$audio_codec <- NULL
  res$subtitle_codec <- NULL
  res
}



#' @rdname ffmpeg_codecs
#' @export
ffmpeg_muxers <- function() {
  ffmpeg <- ffmpeg_exec(quote = TRUE)
  cmd <- paste(ffmpeg, "-muxers")
  result <- system(cmd, ignore.stderr = TRUE, ignore.stdout = TRUE)
  res <- system(cmd, intern = TRUE, ignore.stderr = TRUE)
  res <- trimws(res)
  if (length(res) == 0) {
    res <- ""
  }
  if (result != 0 & all(res %in% "")) {
    warning("No codecs output from ffmpeg for muxers")
    return(NULL)
  }
  res <- res[grepl("^E", res)]
  res <- strsplit(res, " ")
  res <- t(vapply(res, function(x) {
    x <- trimws(x)
    x <- x[x != ""]
    if (length(x) >= 3) {
      x[3:length(x)] <- paste(x[3:length(x)], collapse = " ")
    }
    return(x[seq(3)])
  }, FUN.VALUE = character(3)))
  colnames(res) <- c("capabilities", "muxer", "muxer_name")
  res <- as.data.frame(res, stringsAsFactors = FALSE)
  if (nrow(res) == 0) {
    warning("No codecs output from ffmpeg for muxers")
    return(NULL)
  }
  res$capabilities <- trimws(res$capabilities)
  
  return(res)
}

#' @rdname ffmpeg_codecs
#' @export
ffmpeg_version <- function() {
  ffmpeg <- ffmpeg_exec(quote = TRUE)
  cmd <- paste(ffmpeg, "-version")
  result <- system(cmd, ignore.stderr = TRUE, ignore.stdout = TRUE)
  res <- system(cmd, intern = TRUE, ignore.stderr = TRUE)
  res <- trimws(res)
  if (length(res) == 0) {
    res <- ""
  }
  if (result != 0 & all(res %in% "")) {
    warning("No codecs output from ffmpeg for version")
    return(NULL)
  }
  res <- res[grepl("^ffmpeg version", res)]
  res <- sub("ffmpeg version (.*) Copyright .*", "\\1", res)
  res <- sub("(ubuntu|debian).*", "", res)
  res <- sub("-.*", "", res)
  res <- sub("[+].*", "", res)
  res <- trimws(res)
  return(res)
}

#' @rdname ffmpeg_codecs
#' @export
ffmpeg_version_sufficient <- function() {
  if (have_ffmpeg_exec()) {
    ver <- package_version("3.2.4")
    ff_ver <- ffmpeg_version()
    if (is.null(ff_ver)) {
      warning(paste0(
        "Cannot get ffmpeg version from ",
        "ffmpeg_version, returning FALSE"
      ))
      return(FALSE)
    }
    ff_ver_char <- ff_ver
    ff_ver <- package_version(ff_ver, strict = FALSE)
    if (is.na(ff_ver)) {
      warning(
        paste0(
          "ffmpeg version is not parsed, probably a development version,",
          "version was ", ff_ver_char, ", make sure you have >= ",
          as.character(ver)
        )
      )
      return(TRUE)
    }
    res <- ff_ver >= ver
  } else {
    res <- FALSE
  }
  res
}

#' @rdname ffmpeg_codecs
#' @export
check_ffmpeg_version <- function() {
  if (!ffmpeg_version_sufficient()) {
    ff <- ffmpeg_version()
    stop(paste0(
      "ffmpeg version is not high enough,",
      " ffmpeg version is: ", ff
    ))
  }
  return(invisible(NULL))
}
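
To illustrate the version parsing that `ffmpeg_version()` performs, the sketch below applies the same regular expressions to a banner line without calling ffmpeg. `parse_ffmpeg_version()` is a hypothetical standalone wrapper, and the second banner is a made-up distribution-style string.

```r
# Hypothetical standalone version of the parsing steps in ffmpeg_version().
parse_ffmpeg_version <- function(line) {
  ver <- sub("ffmpeg version (.*) Copyright .*", "\\1", line)
  ver <- sub("(ubuntu|debian).*", "", ver)  # drop distribution suffixes
  ver <- sub("-.*", "", ver)                # drop build tags after a dash
  ver <- sub("[+].*", "", ver)              # drop anything after a plus sign
  trimws(ver)
}

banner <- "ffmpeg version 5.1.2 Copyright (c) 2000-2022 the FFmpeg developers"
ver <- parse_ffmpeg_version(banner)

# Made-up distribution-style banner to exercise the suffix stripping.
distro <- "ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers"
distro_ver <- parse_ffmpeg_version(distro)

# Same comparison that ffmpeg_version_sufficient() makes against 3.2.4.
package_version(ver) >= package_version("3.2.4")
```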

Setup (source the three R scripts above and define `ari_stitch()`)

library(ariExtra)
library(googledrive)
library(pdftools)
library(purrr)


source("utils.R")
source("set_encoders.R")
source("ffmpeg_codecs.R")

## Weave audio and images together
ari_stitch <- function(images, 
                       audio,
                       output = tempfile(fileext = ".mp4"),
                       verbose = FALSE,
                       cleanup = TRUE,
                       ffmpeg_opts = "",
                       divisible_height = TRUE,
                       audio_codec = get_audio_codec(),
                       video_codec = get_video_codec(),
                       video_sync_method = "2",
                       audio_bitrate = NULL,
                       video_bitrate = NULL,
                       pixel_format = "yuv420p",
                       fast_start = FALSE,
                       deinterlace = FALSE,
                       stereo_audio = TRUE,
                       duration = NULL,
                       video_filters = NULL,
                       frames_per_second = NULL,
                       check_inputs = TRUE) {
  # Stop if there are no images
  stopifnot(length(images) > 0)
  # Normalize paths of images and output (return absolute path)
  images <- normalizePath(images)
  output_dir <- normalizePath(dirname(output))
  output <- file.path(output_dir, basename(output))
  # Stop if there is no audio or the output directory does not exist
  stopifnot(
    length(audio) > 0,
    dir.exists(output_dir)
  )
  # Stop unless images and audio have the same length and all images exist
  if (check_inputs) {
    stopifnot(
      identical(length(images), length(audio)),
      all(file.exists(images))
    )
  }
  # Read in wav file using tuneR::readWave()
  audio <- map(audio, tuneR::readWave)
  # pad wav file
  audio <- pad_wav(audio, duration = duration)
  
  if (verbose > 0) {
    message("Writing out Wav for audio")
  }
  if (verbose > 1) {
    print(audio)
  }
  audio <- match_sample_rate(audio, verbose = verbose)
  # reduce audio (list) to single value
  wav <- purrr::reduce(audio, tuneR::bind)
  # create path to store wave file
  wav_path <- file.path(output_dir, paste0("ari_audio_", get_random_string(), ".wav"))
  # write wave file
  tuneR::writeWave(wav, filename = wav_path)
  # output: wav file that contains a voiceover of the entire script
  
  if (cleanup) {
    on.exit(unlink(wav_path, force = TRUE), add = TRUE)
  }
  
  # if some (but not all) images are gifs, convert the rest to gif
  img_ext <- tolower(tools::file_ext(images))
  any_gif <- any(img_ext %in% "gif")
  if (any_gif & !all(img_ext %in% "gif")) {
    if (verbose > 0) {
      message("Converting All files to gif!")
    }
    for (i in seq_along(images)) {
      iext <- img_ext[i]
      if (iext != "gif") {
        tfile <- tempfile(fileext = ".gif")
        ffmpeg_convert(images[i], outfile = tfile)
        images[i] <- tfile
      }
    }
  }
  
  # create txt path
  input_txt_path <- file.path(output_dir, 
                              paste0("ari_input_", get_random_string(), ".txt"))
  
  ## on windows ffmpeg concatenates names by adding the working directory,
  ## so if a complete path is provided it is added twice.
  if (.Platform$OS.type == "windows") {
    new_image_names <- file.path(output_dir, basename(images))
    if (!any(file.exists(new_image_names))) {
      file.copy(images, to = new_image_names)
    } else {
      warning("On windows must make basename(images) for ffmpeg to work")
    }
    images <- basename(images)
  }
  
  # adds "file 'IMAGE_PATH'" and duration 
  # in a .txt file located at input_txt_path
  for (ii in seq_along(images)) {
    cat(paste0("file ", "'", images[ii], "'", "\n"),
        file = input_txt_path, 
        append = TRUE)
    cat(paste0("duration ", wav_length(audio[[ii]]), "\n"),
        file = input_txt_path, 
        append = TRUE)
  }
  # winslash: the separator to be used on Windows
  input_txt_path <- normalizePath(input_txt_path, winslash = "/")
  
  # needed for users as per
  # https://superuser.com/questions/718027/
  # ffmpeg-concat-doesnt-work-with-absolute-path
  # input_txt_path = normalizePath(input_txt_path, winslash = "\\")
  
  # find path to ffmpeg executable
  ffmpeg <- ari::ffmpeg_exec(quote = TRUE)
  
  # set video filters
  if (!is.null(frames_per_second)) {
    video_filters <- c(video_filters, paste0("fps=", frames_per_second))
  } else {
    video_filters <- c(video_filters, "fps=5")
  }
  if (divisible_height) {
    video_filters <- c(video_filters, '"scale=trunc(iw/2)*2:trunc(ih/2)*2"')
  }
  
  
  # workaround for older ffmpeg
  # https://stackoverflow.com/questions/32931685/
  # the-encoder-aac-is-experimental-but-experimental-codecs-are-not-enabled
  experimental <- FALSE
  if (!is.null(audio_codec)) {
    if (audio_codec == "aac") {
      experimental <- TRUE
    }
  }
  if (deinterlace) {
    video_filters <- c(video_filters, "yadif")
  }
  video_filters <- paste(video_filters, collapse = ",")
  video_filters <- paste0("-vf ", video_filters)
  
  if (any(grepl("-vf", ffmpeg_opts))) {
    warning("Found video filters in ffmpeg_opts, may not be used correctly!")
  }
  ffmpeg_opts <- c(video_filters, ffmpeg_opts)
  ffmpeg_opts <- paste(ffmpeg_opts, collapse = " ")
  # output: options to input into ffmpeg
  
  # shQuote should seankross/ari#5
  command <- paste(
    ffmpeg, "-y",
    "-f concat -safe 0 -i", shQuote(input_txt_path),
    "-i", shQuote(wav_path),
    ifelse(!is.null(video_codec), paste("-c:v", video_codec),
           ""
    ),
    ifelse(!is.null(audio_codec), paste("-c:a", audio_codec),
           ""
    ),
    ifelse(stereo_audio, "-ac 2", ""),
    ifelse(!is.null(audio_bitrate), paste("-b:a", audio_bitrate),
           ""
    ),
    ifelse(!is.null(video_bitrate), paste("-b:v", video_bitrate),
           ""
    ),
    " -shortest",
    # ifelse(deinterlace, "-vf yadif", ""),
    ifelse(!is.null(video_sync_method), paste("-vsync", video_sync_method),
           ""
    ),
    ifelse(!is.null(pixel_format), paste("-pix_fmt", pixel_format),
           ""
    ),
    ifelse(fast_start, "-movflags +faststart", ""),
    ffmpeg_opts,
    ifelse(!is.null(frames_per_second), paste0("-r ", frames_per_second), ""),
    ifelse(experimental, "-strict experimental", ""),
    "-max_muxing_queue_size 9999",
    "-threads 2",
    shQuote(output)
  )
  if (verbose > 0) {
    message(command)
  }
  if (verbose > 1) {
    message("Input text path is:")
    cat(readLines(input_txt_path), sep = "\n")
  }
  
  # IMPORTANT: run command in system
  res <- system(command)
  
  
  if (res != 0) {
    warning("Result was non-zero for ffmpeg")
  }
  
  if (cleanup) {
    on.exit(unlink(input_txt_path, force = TRUE), add = TRUE)
  }
  res <- file.exists(output) && file.size(output) > 0
  if (!cleanup) {
    attr(res, "txt_path") <- input_txt_path
    attr(res, "wav_path") <- wav_path
    attr(res, "cmd") <- command
  }
  attr(res, "outfile") <- output
  attr(res, "images") <- images
  
  # return a (temporarily) invisible copy of res
  invisible(res)
}

Image Files

slide1.png

slide2.png

slide3.png

slide4.png

slide5.png

slide6.png

Audio Files

audio_files.zip

Run ari_stitch()


# run ari_stitch() and save output
output <- ari_stitch(
  images = c("slide1.png", "slide2.png", "slide3.png",
             "slide4.png", "slide5.png", "slide6.png"),
  audio = c("tts_output1.wav", "tts_output2.wav", "tts_output3.wav",
            "tts_output4.wav", "tts_output5.wav", "tts_output6.wav")
)


# get file path of mp4 file
attr(output, "outfile")

Navigate to this mp4 in Finder (Mac) and open the video

Message when running `res <- system(command)`:

ffmpeg version 5.1.2 Copyright (c) 2000-2022 the FFmpeg developers
  built with Apple clang version 14.0.0 (clang-1400.0.29.202)
  configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/5.1.2_6 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-neon
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
-vsync is deprecated. Use -fps_mode
Passing a number to -vsync is deprecated, use a string argument as described in the manual.
Input #0, concat, from '/private/var/folders/bb/m2b0ry595ys7bfs1r397lnf40000gp/T/RtmpphX5JC/ari_input_gPwzkFnlk3HL.txt':
  Duration: 00:00:37.00, start: 0.000000, bitrate: 0 kb/s
  Stream #0:0: Video: png, rgb24(pc), 6000x3375 [SAR 23622:23622 DAR 16:9], 25 fps, 25 tbr, 25 tbn
Input #1, wav, from '/private/var/folders/bb/m2b0ry595ys7bfs1r397lnf40000gp/T/RtmpphX5JC/ari_audio_S5ZSsBGmGzBU.wav':
  Duration: 00:00:37.00, bitrate: 352 kb/s
  Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, 1 channels (FL), s16, 352 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (png (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x14c63a330] using SAR=3374/3375
[libx264 @ 0x14c63a330] using cpu capabilities: ARMv8 NEON
[libx264 @ 0x14c63a330] profile High, level 6.0, 4:2:0, 8-bit
[libx264 @ 0x14c63a330] 264 - core 164 r3095 baee400 - H.264/MPEG-4 AVC codec - Copyleft 2003-2022 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=2 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=5 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/private/var/folders/bb/m2b0ry595ys7bfs1r397lnf40000gp/T/RtmpphX5JC/file3c302d23b261.mp4':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(tv, progressive), 6000x3374 [SAR 3374:3375 DAR 16:9], q=2-31, 5 fps, 10240 tbn
    Metadata:
      encoder         : Lavc59.37.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
  Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 22050 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc59.37.100 aac
frame=  150 fps= 11 q=-1.0 Lsize=     658kB time=00:00:30.00 bitrate= 179.7kbits/s speed=2.16x    
video:286kB audio:364kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.219465%
[libx264 @ 0x14c63a330] frame I:2     Avg QP: 2.27  size: 77096
[libx264 @ 0x14c63a330] frame P:38    Avg QP:12.33  size:  1831
[libx264 @ 0x14c63a330] frame B:110   Avg QP:12.66  size:   622
[libx264 @ 0x14c63a330] consecutive B-frames:  2.0%  0.0%  2.0% 96.0%
[libx264 @ 0x14c63a330] mb I  I16..4: 95.0%  2.4%  2.5%
[libx264 @ 0x14c63a330] mb P  I16..4:  0.0%  0.0%  0.0%  P16..4:  0.1%  0.0%  0.0%  0.0%  0.0%    skip:99.9%
[libx264 @ 0x14c63a330] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8:  0.1%  0.0%  0.0%  direct: 0.0%  skip:99.9%  L0:60.3% L1:39.7% BI: 0.0%
[libx264 @ 0x14c63a330] 8x8 transform intra:3.0% inter:3.7%
[libx264 @ 0x14c63a330] coded y,uvDC,uvAC intra: 1.7% 0.4% 0.4% inter: 0.0% 0.0% 0.0%
[libx264 @ 0x14c63a330] i16 v,h,dc,p: 99%  1%  1%  0%
[libx264 @ 0x14c63a330] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 79%  2% 19%  0%  0%  0%  0%  0%  0%
[libx264 @ 0x14c63a330] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 47% 16% 24%  2%  2%  3%  3%  2%  2%
[libx264 @ 0x14c63a330] i8c dc,h,v,p: 99%  1%  0%  0%
[libx264 @ 0x14c63a330] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x14c63a330] ref P L0: 87.6%  1.9%  9.4%  1.1%
[libx264 @ 0x14c63a330] ref B L0: 56.8% 41.9%  1.3%
[libx264 @ 0x14c63a330] ref B L1: 97.9%  2.1%
[libx264 @ 0x14c63a330] kb/s:77.92
[aac @ 0x14c63b650] Qavg: 60691.250
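
A commonly cited quirk of ffmpeg's concat demuxer is that the `duration` of the final entry may be ignored unless the last file is listed once more without a duration. Whether that explains the missing slide here is not confirmed in this thread. The sketch below only shows what that workaround would look like, using a hypothetical helper `write_concat_list()` that is not part of ari.

```r
# Hypothetical helper (not part of ari): write an ffmpeg concat list in which
# the last image is repeated once without a duration line.
write_concat_list <- function(images, durations, path) {
  stopifnot(length(images) > 0, length(images) == length(durations))
  # Interleave "file" and "duration" lines: file1, dur1, file2, dur2, ...
  entries <- as.vector(rbind(
    paste0("file '", images, "'"),
    paste0("duration ", durations)
  ))
  # Repeat the final image so its duration is not dropped.
  entries <- c(entries, paste0("file '", images[length(images)], "'"))
  writeLines(entries, path)
  invisible(path)
}

list_path <- tempfile(fileext = ".txt")
write_concat_list(
  images = paste0("slide", 1:6, ".png"),
  durations = c(5, 7, 6, 6, 6, 7),  # made-up durations in seconds
  path = list_path
)
concat_lines <- readLines(list_path)
```

In `ari_stitch()` the equivalent change would go in the loop that writes `input_txt_path`.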

WISHLIST

Markdown-flavored MD - push button deployment/upload to API
- MD → HTML → PDF, script extraction, then API deployment
Embedded Videos in the video
Selective Hiding of text (integration development)
MARP to PDF. Marp is starting point.
Asset extractor - look in YAML, look for ![]() tags look for <img src and then deal with the fallout when people do other things