TSK - The Scripting Kit

License: MIT License

Shell 100.00%

scala scripting shell linux-shell macos

tsk-tsk's Introduction

Truly Standalone Scala Scripts on Linux and Mac.

Make your Scala programs instantly runnable just by prepending a special (IDE / tooling neutral) preamble to them. The program becomes a self-installable-and-executable shell script that everybody can run on their system without any prerequisites.

The preamble (shell commands disguised as Scala comments) ensures that the prerequisites (everything needed to compile and execute the Scala program) are first downloaded and then used to run the program. Caching skips unnecessary downloads and recompilations once the program has been run.
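
To see why one file can be both valid Scala and a valid shell script, here is a minimal sketch of the trick. It is not the actual TSK bootstrap: the `echo` stands in for the real download-and-run commands, and the temporary file name is arbitrary. When a POSIX shell reads the first line it tries to execute `//`, which is a directory, and the resulting error is discarded by `2> /dev/null`. The commands after the semicolon then run normally, and the final `exit` stops the shell before it reaches the Scala code.

```shell
# Build a throwaway file whose first line is a Scala comment AND a shell command line.
demo=$(mktemp)
cat > "$demo" <<'EOF'
// 2> /dev/null; echo "shell part ran"; exit
object Demo { def main(args: Array[String]): Unit = println("scala part") }
EOF

# The shell silently fails to execute "//", runs echo, then exits before the Scala line.
output=$(sh "$demo")
echo "$output"   # prints: shell part ran
rm -f "$demo"
```

The Scala compiler sees the same first line as an ordinary `//` comment, so the file needs no special tooling support.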

The users of the scripts don't need to install or even know about JVM, SBT or any of the common tooling related to Scala programming. This makes TSK-powered scripts ideal for situations in which Scala would normally be rejected because of the installation complexity and setup overhead.

Use TSK when

you've done your exploration with Ammonite REPL, Polynote, Scala Worksheets, Scala Fiddle, Scastie and want to put your findings to work
you want the result quickly: going for a proper build tool would be overkill at this stage
but at the same time you want full IDE support and good development experience
you want it to be easy to switch your code to a proper build tool later
you want the script to be useful not only to you, or to your Scala developer colleagues, but also to those data science and Python experts next door, who may not really know (or care for) JVM, Scala or related tooling
you don't want to have to care if the target system (colleague's laptop, CI server, Kubernetes pod) has everything installed already

So TSK will be great for:

that small script you will use to automate the common task in your company, especially if the users are not really into the JVM / Scala world
the actually-runnable example code that demonstrates features of your library
that little diagnostic program that you'll run inside of a problematic Kubernetes pod to figure out what's happening
that quick prototype that you want to experiment with, before deciding if it would pay off to invest in a proper build configuration, CI setup and other software-engineery things like that
that small Scala program you want to play with while learning Scala features, before you decide to learn about JVMs, build tools and other distractions

Don't use TSK when

you just want to play with Scala syntax, to explore some individual API methods, to test some smaller components, to perform some quick computations: for these purposes use any of the great Scala exploratory tools like Ammonite REPL, Polynote, Scala Worksheets, Scala Fiddle, Scastie
you're starting a new project that you already know is going to be a bigger thing: in that case go straight for a proper build tool, like SBT or Mill
you want to rely on features of build tools, like compiler plugins, custom warning settings, multi-projects, test frameworks, code coverage measurements, etc.
you don't mind spending extra effort to provide the best possible experience for your end users: in that case use the respective system's package management software or provide an appropriate platform-dependent installer like IntelliJ IDEA has
you (and/or your users) are on Windows without WSL or on any other platform that is not yet supported: Ammonite Scripts will be of help to you
or, in general, when other Scala tools work well for your use-cases already

Example

Say you've got some unstructured text file containing URL addresses and you need to extract the unique URLs. Maybe you thought of using grep for that, but on closer inspection URLs are quite involved beasts! With all the query parameters, escaping etc. the regular expression may be difficult to get right (let alone the readability of the end result). You would be better off using a proper URL validation method, as typically found in programming languages. Ideally the language would be suitable for scripting, so it'd be straightforward to write and run a program without you (and your users) having to fight tooling / dependencies.

Luckily Scala is one of the languages in which URL validation is available (within the standard library via Java SDK), and with the help of this project it is well suited for scripting as well. You can make a runnable URLGrep script using Scala and TSK in a couple of minutes by following the steps below:

Save the following snippet into URLGrep.scala file (or download it):

// 2> /dev/null; . $(curl -sL https://git.io/boot-tsk | sh); run; exit

object URLGrep {

  def main(args: Array[String]) = {
    val urls = for {
      line  <- io.Source.stdin.getLines()
      token <- line.split("\\s+")
      clean <- token.split("[\"']")
      if util.Try(new java.net.URL(clean)).isSuccess
    } yield clean
    urls.toSet.toSeq.sorted.foreach(println)
  }

}

Set the executable bit:

chmod +x URLGrep.scala

Voilà! Your script is good to go. Pipe some text to its standard input to try it out:

echo "something something http://google.com" | ./URLGrep.scala
curl https://scala-lang.org | ./URLGrep.scala

The first run will take quite long (maybe even a couple of minutes) because TSK needs to download several dependencies in the background first. Subsequent runs will be quick, because everything is already cached on disk.

One nice thing is that it will work on any Linux or Mac system you run it on, as long as curl or wget is installed. Think about it: you can pass the script to your coworkers and it will work without any cumbersome tool installation. The same goes for Docker containers: as long as the image has curl or wget, your program will work there.

The above example is meant to whet your appetite by demonstrating some useful basics, but the real fun begins with external libraries that can come from both public and company-private repositories.

Make sure you glance over the features section and also see the wiki for some guidance and explanations that go into more detail. You can also play with the attached examples to see various features of TSK in action.

Main features

The simplest possible workflow: you write the script, and you make it executable. That's it. The initial script run downloads any required dependencies that are not on the machine yet.
All Scala and Java libraries are available, as long as they are in some Maven or Ivy repository (internal corporate repositories requiring credentials and/or proxies are supported as well, so you can use your in-house components in your scripts)
Regular Scala, without any syntax that'd confuse standard tooling (editable without red squiggles in IntelliJ IDEA). When your script grows somewhat, but not to the degree where it needs a full-blown project, split it into separate files with all Scala constructs (like packages) working as expected
Minimal prerequisites - apart from a working internet connection you only need wget or curl.
Support for:
- macOS (tested on AppVeyor, also had some positive user reports)
- fresh Docker images of the following Linux distributions:
  - Alpine, Arch Linux, Fedora out of the box (they've got curl / wget)
  - Debian, Ubuntu after you install curl
- most likely your Linux distribution, even without root permissions, as long as you've installed curl or wget
Experimental support for Ammonite scripts (use run_with_ammonite instead of run)
Compilation of your script to a native binary (with GraalVM) to reduce the script's memory footprint and startup time
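
The "curl or wget" prerequisite from the list above can be illustrated with a short sketch of how a bootstrap script might choose between the two tools. This is an assumption about the general approach, not TSK's actual code; the function names `pick_downloader` and `fetch` are hypothetical.

```shell
# Print the name of the first available downloader, or fail if neither exists.
pick_downloader() {
  for tool in curl wget; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  echo "neither curl nor wget found" >&2
  return 1
}

# Fetch a URL to standard output with whichever tool is present.
fetch() {
  case "$(pick_downloader)" in
    curl) curl -fsSL "$1" ;;
    wget) wget -q -O - "$1" ;;
    *)    return 1 ;;
  esac
}
```

A caller would then write something like `fetch "$url" > "$target"` and never need to know which tool did the work.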

Planned features

Easy migration to a full-blown Scala project when the script grows. The script is valid Scala, so the existing tooling handles it perfectly well - the TSK-specific parts are hidden from the Scala compiler within the Scala comment block. TSK will be able to generate SBT and Mill projects
Robust handling of errors made in the shell part (preamble)

Acknowledgements

TSK stands on the shoulders of giants. Kudos to all authors and contributors of the following technologies:

Scala, which is my favorite programming language
Ammonite - the best Scala REPL, which also pioneered Scala scripting capabilities
Coursier, which made it super-easy to manage Scala and Java dependencies
Bloop, which provides great compilation and IDE interoperation features
Unix, with fantastic scripting capabilities

Special thanks

To that ScalaPolis2016 attendee who noticed that it's possible for a file to be both valid Scala and valid shell.
To those of the ScalaPolis2016 and FunctionalTricity 28.04.2016 attendees, who have appreciated my points and to those who made fun of them. I enjoyed our conversations very much 😄
To all people who gave me feedback after my talk on TSK at the Scala Love 2021 conference
To the wonderful organizers of the mentioned events for having me and for still wanting me to speak! 😉

tsk-tsk's Issues

Enable pretty preamble

It should be possible to shorten the TSK preamble to:

package app /* 2> /dev/null
v=1.0.0; . $(u=https://git.io/boot-tsk-$v; (cat ~/.tsk/boot-tsk-$v || wget -O - $u || curl -fL $u) | v=$v sh)
run "$@"; cat "${tsk_log}" >&2; exit 1 # */

The boot-tsk would be a TSK bootloader that would:

ensure that the actual TSK is there - if it's not, then download it
output the path to actual TSK so the source command interprets it

The drawback would be that on the first ever call (for that version) there is one more network roundtrip, but given we need to download plenty of things anyway, that should be relatively insignificant.

Export to Mill build

Generate a skeleton of a Mill project based on the declared dependencies. Ignore exclusions and private repositories for now.

Do not start Bloop unless necessary

If the script has not changed since its last compilation there is no need to run Bloop.

TSK needs to detect this situation and not even try to run Bloop then.
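
One possible way to detect an unchanged script is to store a checksum next to the compiled output and compare it on the next run. The sketch below assumes this approach; the function names and the stamp-file location are hypothetical. Because the preamble (including declared dependencies) lives in the same file, a checksum of the file also covers changes to those settings.

```shell
# Succeeds (exit 0) when the script differs from the checksum recorded at the last compilation.
script_changed() {
  script=$1
  stamp=$2
  current=$(cksum < "$script")
  [ ! -f "$stamp" ] || [ "$current" != "$(cat "$stamp")" ]
}

# Record the current checksum after a successful compilation.
record_compiled() {
  cksum < "$1" > "$2"
}
```

The compile step would then be wrapped as `if script_changed "$script" "$stamp"; then ...compile...; record_compiled "$script" "$stamp"; fi`.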

If dependencies are empty, add Scala standard library

Try to detect main class for Scala 3 programs

Right now one needs to use main_class='ScriptName$package' which is obviously inconvenient and defeats the point of top-level definitions.

URLGrep example doesn't work for me - macos big sur

This looks amazing! Thanks so much for making this. I just tried it on macOS Big Sur 11.2.1 and got this error:

Error: Could not find or load main class Compiled from "URLGrep.scala"public final class URLGrep {  public static void main(java.lang.String[]);}

Regression: make it work on Alpine again

Bloop binary fails because it's compiled with standard libc. Use the JVM version of Bloop on Alpine.

Depend on explicit versions of Coursier and Bloop

and also make it possible to override these versions within the user script.

Support final version of Scala 3

The following tsk script:

// 2> /dev/null \
/*
scala_version="3.0.0"
source $(curl -sL git.io/boot-tsk | sh)
run
exit
*/

@main def main() = println("ohi")

fails to run as it resolves compiler artifact name to scala3-compiler_3.0.0, when in fact it is scala3-compiler_3:

~ ─╼ sh Scala3.scala                                                                                                                                                                                                                                         
Fetching all libraries the script depends upon
Resolution error: Error downloading org.scala-lang:scala3-compiler_3.0.0:latest.release
  not found: /Users/martynas/.ivy2/local/org.scala-lang/scala3-compiler_3.0.0
  not found: https://repo1.maven.org/maven2/org/scala-lang/scala3-compiler_3.0.0/maven-metadata.xml
Error downloading org.scala-lang:scala3-library_3.0.0:latest.release
  not found: /Users/martynas/.ivy2/local/org.scala-lang/scala3-library_3.0.0
  not found: https://repo1.maven.org/maven2/org/scala-lang/scala3-library_3.0.0/maven-metadata.xml
Compiling Scala3 (1 Scala source)
Compiled Scala3 (378ms)
Resolution error: Error downloading org.scala-lang:scala3-library_3.0.0:latest.release
  not found: /Users/martynas/.ivy2/local/org.scala-lang/scala3-library_3.0.0
  not found: https://repo1.maven.org/maven2/org/scala-lang/scala3-library_3.0.0/maven-metadata.xml
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: scala/util/CommandLineParser$ParseError
	at java.lang.Class.getDeclaredMethods0(Native Method)
	at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
	at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
	at java.lang.Class.getMethod0(Class.java:3018)
	at java.lang.Class.getMethod(Class.java:1784)
	at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:650)
	at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:632)
Caused by: java.lang.ClassNotFoundException: scala.util.CommandLineParser$ParseError
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	... 7 more

Support internet proxies

for corporate environments mostly

Export to SBT build

Generate a skeleton of an SBT project based on the declared dependencies.
Exclusions don't need to be supported.

Error running from zsh - set: Illegal option -o pipefail

Hi @przemek-pokrywka very nice tool :)

I'm facing the following issue trying it in Ubuntu and Zsh (Zsh is my login shell):

❯ echo "something something http://google.com" | ./URLGrep.scala
./URLGrep.scala: 706: set: Illegal option -o pipefail

❯ ./Joke.scala 
./Joke.scala: 414: set: Illegal option -o pipefail

But if I run it from a Bash shell instead (or bash -c 'command'), it works fine:

❯ bash -c 'echo "something something http://google.com" | ./URLGrep.scala'
http://google.com

❯ bash -c './Joke.scala' 
Why is 6 afraid of 7 in hexadecimal Canada?
3...
2...
1...
Because 7 8 9 A?

I guess that the issue is related to the following, but I don't know why only zsh is affected:

More info:

sh in Ubuntu is a symlink to /usr/bin/dash, which seems to not support -o pipefail
zsh supports -o pipefail
zsh is my login shell

❯ echo $SHELL                                                             
/usr/bin/zsh

❯ which sh                 
/usr/bin/sh

❯ l /usr/bin/sh            
lrwxrwxrwx 1 root root 4 mar 23 14:49 /usr/bin/sh -> dash

Thanks!
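
A likely explanation, stated here as an assumption rather than something confirmed in this thread: the script has no `#!` line, so the kernel refuses to execute it directly and the calling shell decides what happens next. Bash runs such files itself, while zsh hands them to `/bin/sh`, which is dash on Ubuntu. A defensive sketch that enables `pipefail` only where the running shell accepts it could look like this (the function name is hypothetical):

```shell
# Turn on pipefail only when the running shell understands it.
# The probe runs in a subshell so an "Illegal option" error cannot abort the script.
enable_pipefail_if_supported() {
  if (set -o pipefail) 2>/dev/null; then
    set -o pipefail
  fi
}

enable_pipefail_if_supported
echo "still running"
```

On shells without the option the script keeps running, at the cost of not noticing failures in the middle of a pipeline.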

Fine-tune verbosity levels

The current default verbose=false creates an impression of script freezing when it downloads large files (JVM, Coursier).
Ideally the default would be some middle point between false and true, so:

user sees some indication of things happening in the background, ideally a progress bar
the output is not overly verbose

Consider not keeping Bloop project in the script's directory

Having .bloop dir close to the script is great because the script is immediately editable in editors that support Bloop (IntelliJ, VS Code), but some users might prefer to not pollute the script's directory, especially if it also contains some projects compiled by Bloop. In that case the Bloop directory might be held in ~/.tsk/cache/tsk_version/compilation/SHA_OF_SCRIPT_ABS_PATH, slightly analogous to what Ammonite does.

Review this default and propose a clear and frictionless way of supporting the alternative.

Reuse existing JVM if possible

If java_version is not specified, try to find any JVM that might be available on the target system.
If java_version is specified and a JVM with that version can be found on the target system, use it, even when Coursier doesn't manage it.

This is to avoid wasting time, bandwidth and disk space.
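
A minimal sketch of the lookup, assuming the conventional places a JVM is found: `JAVA_HOME` first, then `PATH`. The function name is hypothetical, and matching the result against a requested java_version is deliberately left out.

```shell
# Print the path of an already installed java binary, or fail if none is found.
find_existing_java() {
  if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
    echo "$JAVA_HOME/bin/java"
  elif command -v java >/dev/null 2>&1; then
    command -v java
  else
    return 1
  fi
}
```

When the function fails, the caller would fall back to downloading a JVM as it does today.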

Kill Bloop on a known Bloop issue

If TSK log contains

error: scala.reflect.internal.MissingRequirementError: object java.lang.Object in compiler mirror not found.

then it is highly probable that Bloop server broke.
It is better to kill it than to leave the user to figure that out on their own.

Add preamble that uses existing installation of tsk

I do not want a curl request every time my scripts are run (both for offline access and because of the security implications). I would like to install tsk once, and use a preamble that uses this installation.

Some unrelated questions:

Does the preamble assume a particular shell is used to open the file? Which shell will be used? The value of $SHELL ?
I could not find the documentation on compiling the script to a native binary.

version of java is incorrectly reported

the PATH needs to be set before the nested command -v

Add watch flag for automatic script rerunning

I think it would be good if TSK supported something like the --watch flag from Ammonite, where the script is rerun automatically on file changes.

For reference https://ammonite.io/#WatchandReload

Document features of the 0.0.3 version

how to change Scala version
how to declare dependencies
how to specify private repositories
how to specify forced versions of artifacts
how to specify verbosity
which versions of Coursier and Bloop are used
how exactly the shell preamble works

Reduce default verbosity so users can pipe script outputs to other processes easily

Unless explicitly requested by verbosity=verbose, the messages from tsk internals need to be hidden from the user, because the user may wish to redirect the output of their script to some other process and it would be inconvenient to have to skip the script startup logs.

On the other hand, if execution of the script fails, all the logs should ideally be displayed.

That may be achievable by writing to a temporary file and then, depending on the outcome, either deleting it (if all is okay) or writing it to standard error (and only then deleting it).
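
The temporary-file idea can be sketched as a small wrapper. This is an illustration under the assumption that it wraps only TSK's internal steps (the user's own program output must not be captured); the name `run_quietly` is hypothetical.

```shell
# Run a command with all of its output captured in a temporary log.
# On success the log is discarded; on failure it is replayed on standard error.
run_quietly() {
  log=$(mktemp) || return 1
  if "$@" > "$log" 2>&1; then
    rm -f "$log"
  else
    rc=$?
    cat "$log" >&2
    rm -f "$log"
    return "$rc"
  fi
}
```

For example, `run_quietly sh -c 'echo compiling'` prints nothing, while `run_quietly sh -c 'echo boom; exit 3'` writes `boom` to standard error and returns 3.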

Executable scala file not able to run on fish

With the following preamble in an example.scala file that has the executable bit set, I'm able to run ./example.scala in both Bash and Zsh. However, fish-shell complains with the error below.

// 2> /dev/null; source $(curl -sL https://git.io/boot-tsk | sh); run; exit

fish error:

❯ ./example.scala
Failed to execute process './example.scala'. Reason:
exec: Exec format error
The file './example.scala' is marked as an executable but could not be run by the operating system.

Support Replit?

https://replit.com/@AEthyr/Scala-Tsk-Tsk

> ./HelloReplit.scala 
FATAL ERROR: Error while pre-warming Bloop

Full log:
---------
cat: /tmp/tsk-1425.log: No such file or directory

> ls -altr /tmp/tsk*
-rw-r--r-- 1 runner runner 634 Dec 22 00:49 /tmp/tsk-69.log
-rw-r--r-- 1 runner runner 634 Dec 22 00:53 /tmp/tsk-711.log
-rw-r--r-- 1 runner runner 634 Dec 22 00:55 /tmp/tsk-989.log
-rw-r--r-- 1 runner runner 634 Dec 22 00:56 /tmp/tsk-1109.log
-rw-r--r-- 1 runner runner 634 Dec 22 00:57 /tmp/tsk-1429.log

Docs: Add links to similar efforts in README

ammonite-runner
catscript

Publish tsk and boot-tsk to Maven Central

This is important especially for corporate users that have limited access to the internet.

It also eliminates one security risk: that someone gains access to the Git repo and overwrites a tag with malicious code.

Make TSK preamble more concise

Check if it would be possible to have a more concise preamble, like:

// 2> /dev/null \
/*
source $(curl -sL git.io/boot-tsk | sh)
 */
object Hello extends App { println("Hello") }

or even

// 2> /dev/null; source $(curl -sL git.io/boot-tsk | sh)
object Hello extends App { println("Hello") }

It seems to be possible if:

TSK would only set variables if they are empty
TSK does not jump into running the script if it detects debug mode (source ~/.tsk/tsk Script.scala)
TSK does jump into running the script otherwise

Consider turning TSK into a CLI app

The app could be more convenient to use in certain situations, like:

exporting script (set of scripts) to a full-blown SBT / Mill (/Bloop?) project
adding a dependency npm-style (but also with tab completion powered by Coursier)
turning the script (set of scripts) into a native binary powered by Coursier + GraalVM

Halt Bloop project generation in case of conflict

Bloop projects should only be (re)generated in two situations:

starting from scratch (no Bloop project with that name yet)
regenerating Bloop config for a TSK-powered project (TSK needs to put a marker inside of the Bloop config in order to make it possible to detect that the project is managed with TSK)

Make it easy to source include tsk in a shell session

Due to the presence of exit statements, tsk is not convenient to run within an existing shell session. The exits need to be replaced by returns so the shell session doesn't end abruptly in case of some errors.
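
A common shell idiom for code that may be either sourced or executed is `return N 2>/dev/null || exit N`: `return` succeeds when the file is sourced, and the `exit` is only reached when the file is run as a standalone script. The sketch below demonstrates the idiom with a throwaway file; it is an illustration, not TSK's code.

```shell
# A stand-in for a library file that hits a fatal error.
lib=$(mktemp)
cat > "$lib" <<'EOF'
echo "fatal: something went wrong" >&2
return 3 2>/dev/null || exit 3
EOF

# Sourcing it reports the failure through the exit status without killing this shell.
if . "$lib" 2>/dev/null; then rc=0; else rc=$?; fi
echo "session still alive, rc=$rc"   # prints: session still alive, rc=3
rm -f "$lib"
```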

Include only the current file by default

One expects to be able to put a normal shell script into (reasonably) any directory.
That includes ~/bin and even the user's home.

With TSK that is a problem right now, because it uses Bloop, which tries to compile all files from the script's directory (and from all subdirectories). Bloop may then attempt to compile multiple completely unrelated source files, which will very often fail spectacularly.

For that reason Bloop should only try to compile the single script file, unless explicitly instructed to do otherwise with a new custom TSK setting (sources for example).

Document features of 0.0.20 version

how to edit the sources in IntelliJ IDEA with full highlighting and code navigation
how to specify Java version to use on Alpine
how to specify explicit version of tsk so the script can be run offline

Can the user simply write `run` instead of `run "$@"`?

Docs: add a section about security

Especially elaborate on the threat model and curl | sh, and add an example of a script that doesn't use curl | sh

Run the compiled classes directly with JVM rather than with bloop run

A limitation of bloop run is that the standard input can't be passed to the script, which prevents some popular use-cases. Also Bloop sometimes hangs and it's not easy to unfreeze it.

For these reasons it may be better to run the classes directly after determining which of them is the main one. It may be reasonable to impose a convention on users: unless the main class name is specified explicitly, it follows the name of the Scala source file. The explicit main_class variable should also be supported for non-standard cases.
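
The proposed convention fits in one line of shell. In this sketch, `main_class` is the variable named above, while the function name `resolve_main_class` is hypothetical:

```shell
# Use main_class when the user set it, otherwise fall back to the file name without ".scala".
resolve_main_class() {
  echo "${main_class:-$(basename "$1" .scala)}"
}

resolve_main_class ./URLGrep.scala   # prints URLGrep when main_class is unset
```

The resolved name would then be passed to the JVM together with the classpath of the compiled classes.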

Support repository credentials

In order to be able to fetch artifacts from some company-internal repositories
