lifthrasiir / roadroller
Roadroller: Flattens Your JavaScript Demo
Home Page: https://lifthrasiir.github.io/roadroller/
License: Other
`-O1` has been the default in the CLI since 2.0.0, but the online demo doesn't optimize until the button is pressed. Ideally the demo should respond instantly after a textarea update using the default parameters (`-O0`), then optimize in the background to produce the `-O1` equivalent, so that the demo behaves more or less identically to the CLI. This obviously requires the web worker (#5).
The current Roadroller algorithm takes time linear in the input length, so a length-reducing preprocessor is desirable as long as it doesn't interfere with context modelling. In fact the current JS preprocessor is one such preprocessor (it essentially replaces "text" models, where the context unit is a word or token, with more general bytewise models). But it doesn't work for non-JS inputs, so we need more general solutions.

In particular LZP is a promising candidate because of its simplicity. I had even hoped that an LZP-only mode would be possible for shorter inputs, but that doesn't seem to be the case at the moment (the PNG bootstrap works better).
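For reference, the core of LZP is small. Here is a toy byte-level sketch; the 16-bit context hash, the flag-per-byte coding, and all names are illustrative, not a proposed Roadroller design:

```javascript
// Toy LZP: hash the recent context into a table of last-seen positions.
// When the table predicts the next byte, emit a 1-byte hit flag instead of
// the byte itself; on a miss emit a 0 flag followed by the literal.
function lzpEncode(bytes) {
  const table = new Int32Array(1 << 16).fill(-1);
  const out = [], hist = [];
  let ctx = 0;
  for (const b of bytes) {
    const pred = table[ctx];
    if (pred >= 0) {
      if (hist[pred] === b) out.push(1); // prediction hit
      else out.push(0, b);               // miss: flag plus literal
    } else {
      out.push(b);                        // context never seen: plain literal
    }
    table[ctx] = hist.length;
    hist.push(b);
    ctx = ((ctx << 8) ^ b) & 0xffff;      // rolling 16-bit context hash
  }
  return out;
}

function lzpDecode(coded) {
  const table = new Int32Array(1 << 16).fill(-1);
  const hist = [];
  let ctx = 0, i = 0;
  while (i < coded.length) {
    const pred = table[ctx];
    let b;
    if (pred >= 0) {
      b = coded[i++] ? hist[pred] : coded[i++]; // flag, then maybe a literal
    } else {
      b = coded[i++];
    }
    table[ctx] = hist.length;
    hist.push(b);
    ctx = ((ctx << 8) ^ b) & 0xffff;
  }
  return hist;
}
```

Since both sides update the table identically, the transform is losslessly reversible; a real preprocessor would pack the flags into bits or run lengths rather than whole bytes.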
See demo.html line 424: `dynamicModels` is declared as a constant, but line 426 assigns `dynamicModels = undefined`.
Context mixing commonly uses the logistic function f(x) = 1/(1+exp(-x)) as an activation function, but it is not the only possibility. Since Math.log/Math.exp are quite expensive and their accuracy is implementation-defined (although all known browsers use the same approximation), it is worthwhile to see whether an alternative activation function can be used.

I've personally tried two possibilities, g(x) = x/sqrt(1+x^2) and h(x) = x/(1+abs(x)). (Note that IEEE 754-2008 requires sqrt to be correctly rounded.) After scaling they are roughly similar to the original logistic function:
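A quick numeric comparison illustrates the similarity; the rescaling of g and h from (-1, 1) to (0, 1) below is mine, purely for illustration:

```javascript
// f is the logistic function; g and h are the two alternatives, rescaled
// to (0, 1) so that all three can be compared directly.
const f = x => 1 / (1 + Math.exp(-x));
const g = x => (x / Math.sqrt(1 + x * x) + 1) / 2;
const h = x => (x / (1 + Math.abs(x)) + 1) / 2;
for (const x of [-4, -1, 0, 1, 4]) {
  console.log(x, f(x).toFixed(3), g(x).toFixed(3), h(x).toFixed(3));
}
```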
g(x) turned out to be a drop-in replacement for the logistic function, while h(x) required an adjustment to the learning rate (by a factor of about 2–10). As an example, the following is the required change against 3f60e44 for g(x).
```diff
 for (let i = 0; i < this.models.length; ++i) {
     const weight = this.weights[i];
-    const prob = this.models[i].predict(context) * 2 + 1;
-    const stretchedProb = Math.log(prob / ((2 << this.precision) - prob));
+    const prob = (this.models[i].predict(context) * 2 + 1) / (1 << this.precision) - 1;
+    const stretchedProb = prob / Math.sqrt(1 - prob * prob);
     this.stretchedProbs[i] = stretchedProb;
     total += weight * stretchedProb;
 }
 // since CM is the last model and predictions are not stored,
 // we can just compute the external probability directly.
-const mixedProb = (2 << this.precision) / (1 + Math.exp(-total));
+const mixedProb = (total / Math.sqrt(1 + total * total) + 1) * (1 << this.precision);
 if (mixedProb >= (2 << this.precision)) {
     throw new Error('LogisticMixModel.predict: weighted average overflow');
 }
```
```diff
     // calculate the mixed prediction q
     `x=u.map((j,i)=>(` +
-        `y=p[j]*2+1,` +
+        `y=(p[j]*4+2)/M-1,` +
         // stretch(prob), needed for updates
-        `y=Math.log(y/(M-y)),` +
-        `q-=w[i]*y,` +
+        `y/=Math.sqrt(1-y*y),` +
+        `q+=w[i]*y,` +
         // premultiply with learning rate
         `y${learningRateNum == 1 ? '' : '*'+learningRateNum}/${learningRateDenom}` +
     `)),` +
     // q: squash(sum of weighted preds) followed by adjustment
-    `q=M/(1+Math.exp(q))|1,` +
+    `q=(q/Math.sqrt(1+q*q)+1)*M/2|1,` +
     // decode the bit b
     `b=t%M<q,` +
     `t=(b?q:M-q)*(t>>${precision + 1})+t%M-!b*q` +
```
Unfortunately neither was enough to replace the logistic function: g(x) was 1–400 bytes larger and h(x) was 1000+ bytes larger for most samples I've tried. This is really unfortunate, as g(x) was significantly faster than f(x) in V8 (by about 10%). Any other suggestions are welcome.
Although Node.js supports ES modules, many modern build processes still rely on CJS exports. Maybe we should support multiple environments with something like Rollup:
```js
// rollup.config.js
export default [
  {
    input: './index.js',
    output: { file: 'dist/index.js', format: 'esm' },
  },
  {
    input: './index.js',
    output: { file: 'dist/index.cjs', format: 'cjs' },
  },
];
```

```js
// package.json
{
  "main": "dist/index.cjs",
  "module": "dist/index.js"
}
```
A syntax error (or possibly unsupported future syntax like #23) currently throws a fixed exception. It should instead point to the context and line number. The column number would also be good to have, but accurately calculating the column number is much harder than it seems (tab expansion, surrogate pairs, East Asian width etc.).
`PackerOptions.disableWasm` is already recognized but missing in the d.ts. We also need an equivalent option in the CLI, probably `--wasm|--no-wasm` (with no shorthands).
`-OO` currently starts a new search at the previous best parameters, but running `-O2` multiple times independently might give a better result. There may be multiple reasons: linear parameters can always get stuck in local minima, and SA might not be sufficient to escape from the current local minimum. For that reason it was suggested that `-O3` should just run `-O2` (or, equivalently and more efficiently, only the simulated annealing part) 10 times and keep the best among them.
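The suggestion amounts to a restart loop. A minimal sketch, assuming a hypothetical `runOnce` that performs one independent `-O2`-style search and resolves to `{ size, params }`:

```javascript
// Run the search n times independently and keep the smallest result.
// runOnce is a placeholder for one full simulated-annealing pass.
async function optimizeWithRestarts(runOnce, n = 10) {
  let best = null;
  for (let i = 0; i < n; ++i) {
    const result = await runOnce();
    if (!best || result.size < best.size) best = result;
  }
  return best;
}
```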
This is hinted in some commented codes, but we need to be able to put multiple inputs into a single compressed stream:
The exact design is not yet finished, but it almost surely involves an additional code in the evaluated JS input and special identifiers like $ROADROLLER$INPUT_NAME
.
Currently each wasm compressor instance holds its own memory, so we can't blindly increase the number of instances. Instead we can make the memory shared (and update the marking mechanism accordingly), so that we control the memory cache ourselves. This can reduce the peak memory used during compression.
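For illustration, instances can import one `WebAssembly.Memory` object instead of each exporting their own; this assumes the module is compiled to import its memory, and the names and sizes are illustrative:

```javascript
// One memory object shared by every compressor instance. The instances then
// read and write the same buffer, so the cache can be managed in one place.
const memory = new WebAssembly.Memory({ initial: 256, maximum: 4096 }); // 16 MB up to 256 MB
const makeInstance = module =>
  new WebAssembly.Instance(module, { env: { memory } });
```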
The current wasm compressor produces an array of bitwise probabilities (count = `input.length * inBits`) that is fed to `ANSEncoder`, but the optimization process only needs the number of bytes used, so this can be massively simplified. The length of the first line is `-log_2(PROD_i { predictions[i] / 2^(precision + 1) })` plus the coding overhead, while the length of the second line is currently independent of the exact content of the first line.

I've experimentally implemented this and found that while it is indeed faster, the estimate was slightly off for an unknown reason, probably because the coding overhead is not as insignificant as assumed. This can be a problem at the later stages of the optimization, where each improvement is as small as a single byte. Whether this is relevant or not hasn't yet been investigated.
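The simplification boils down to summing self-information instead of actually coding. A sketch, where the names are mine and the final `ceil` is exactly where the ignored coding overhead hides:

```javascript
// predictions[i] is assumed to be the scaled probability (out of
// 2^(precision+1)) assigned to the bit that actually occurred.
function estimateBytes(predictions, precision) {
  let bits = 0;
  for (const p of predictions) {
    bits -= Math.log2(p / 2 ** (precision + 1)); // self-information of each bit
  }
  return Math.ceil(bits / 8); // ignores the coding overhead entirely
}
```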
I thought it wasn't much of an issue, but it is clearly causing a UX problem. U+007F seems okay, but U+001F doesn't render in many environments, so an alternative would be appreciated.
Maybe I can use nvm or similar to test locally, or set up the CI instance to make sure of that.
Hello greatest coder in South Korea (unified Korea?)
Anyway, I love your work on this. It's crazy. Good. Good crazy. And crazy good too!!! :p ;) xx ;p
Can you please make a parcel plugin? You probably won't, because it's too much work I guess and you don't need to.
I can hack this by running roadroller on my scripts after bundling with parcel...but I would just love to be able to throw it in.
OMG that would be great.
Thank you kindly sir, you are the greatest programmer in the Koreas (i believe)!!! :p :) xx ;p
cris
Thanks for your awesome and inspiring work! I am working on a somewhat similar effort (https://github.com/eyaler/ztml) and wanted to compare compression performance with Roadroller. I ran some initial benchmarks which you can see in the readme; however, my use case is one where everything has to be in a single JS file (no following gzip), which is not fair to Roadroller. I found the commented-out `--uncompressed-only` ... not sure if this is something you are planning to implement, or otherwise do you have any suggestions on how to change your code for this goal?
The online demo currently does the compression and optimization in the main thread. While it does yield whenever possible, the slowdown is noticeable. It is currently implemented that way because otherwise we would need to split (or possibly duplicate) non-worker and worker code; it can be done today with some build-level hacks, but that is annoying and I prefer to do it with a proper multiple input mechanism instead (#8).
Hello, this is incredible.
I checked the license, and "any decoder code is put in the public domain." Is that to be taken to mean that if I compress a web app bundle with Roadroller then the web app code will be public domain?
I would love to use this to package up resources, but I couldn't if it would change the license of what I applied it to.
This is the approach used by base-122 and optimal if you are using UTF-8. It recognizes the forbidden code units (in this case `\r`, `$`, `` ` ``, `\` and possibly `\0`) and encodes each of them, together with the following code unit, into a two-byte UTF-8 sequence:

```
0xxxxxxx           for allowed code units
110xxxyy 10yyyyyz  for forbidden code units xxx (indexed) and the following code unit yyyyyyy
```

It should be noted that two-byte UTF-8 sequences have up to two spare bits to spare (the z above). We may need a change to ANSEncoder to leverage this spare bit, so it is not as trivial.
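A sketch of the pair encoding under the layout above; the forbidden set, the 3-bit index, and leaving z as 0 are all assumptions for illustration:

```javascript
// Pack a forbidden code unit (by its index) and the following 7-bit code unit
// into a two-byte UTF-8 sequence per the 110xxxyy 10yyyyyz layout.
const FORBIDDEN = ['\r', '$', '`', '\\', '\0'].map(c => c.charCodeAt(0));
function packPair(forbidden, next) {
  const idx = FORBIDDEN.indexOf(forbidden);        // 3 bits (xxx)
  const b1 = 0b11000000 | (idx << 2) | (next >> 5);      // 110xxx + top 2 bits of next
  const b2 = 0b10000000 | ((next & 0b11111) << 1);       // 10 + low 5 bits of next + z=0
  return [b1, b2];
}
```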
The Roadroller compression library can make use of shared presets for better compression, but it is unclear how to leverage presets in the generated decoder. I have considered two possibilities:
Line 301 in demo.html: `sel === sel` is always true. What does it mean?
xem's int2binary2html gives three approaches to encoding binary data. The second is based on single-byte encoding (#31) and the third is not as efficient as the current six-bit encoding (documented in #29), but the first is based on XHR/fetch, which is compatible across many settings.

The modus operandi is as follows: the HTML contains a hidden chunk of binary data (which can be interpreted as U+FFFD by browsers). JS then reads the HTML file itself as an ArrayBuffer and extracts the required bytes. The decoder would then look like this:

```js
with(new XMLHttpRequest)open('GET','#'),responseType='arraybuffer',send(),onload=r=>{...} // XHR
fetch('#').then(r=>r.arrayBuffer()).then(r=>{r=new Uint8Array(r);...}) // fetch
```
```html
<canvas id=a><script>/* Roadroller output from a JS code using a */</script>
```

This won't work, because `a` (among others) gets assigned during the decompression and shadows `window.a`.
The problem is that this might be okay depending on the input, and we definitely want to exploit that whenever possible. There are ~16 variables used by Roadroller, carefully chosen so that the decoder gets compressed more, and changing any of them can have undesirable effects. One possible resolution is to put everything into a giant function:
```js
eval(((A='....', t=1234, a, c, f, l, ...) => {
    /* code */
})())
```
But I'm not sure how to make it compatible with multiple inputs (#8).
For now you should avoid using the following ids in your HTML, either outside `<script>` or written with `document.write`:

A M a c e f l m o p r t u w x y

If you can't avoid using them there are two other workarounds:

1. Prepend `window.` to all uses of such variables.
2. Add `delete VARIABLE;` for all affected variables at the beginning of your code.

Rollup is pretty cool for bundling ES modules. If you want to minify the generated code, you need to use a plugin like "uglify" or whatever. I was thinking that a Roadroller Rollup plugin may be useful and would be pretty cool.
Also -- Happy New Year @lifthrasiir! and good to see you are working on many new compression innovations! :)
Currently optimization is fully probabilistic, which can be an issue for automated build processes. For now I'm not particularly keen on seeded optimization, because there has been a great variance across multiple optimization runs (for example, a particularly lucky run can save 10–100 additional bytes), but it is a possibility.
Roadroller strongly depends on DEFLATE's Huffman coding because JS string literals are not efficient in terms of information entropy (~6.96 bits per byte). The first line specifically exploits this by using the minimum amount of literal bytes, but a stock zlib doesn't fully recognize this difference in the symbol distribution and normally combines the two lines into one block. Zopfli-like tools do recognize this, but the user has to use those tools to benefit from it.

Maybe we can solve this UX problem by having Roadroller directly produce an optimal ZIP/gzip file itself. This is not a small task because:

- … the `<script>` tag.
- While Roadroller somehow has a working implementation of zlib (`-5`), the optimal size can only be reached with Zopfli or a similar tool, so Roadroller would have to depend on it.
The current model assumes that every opening quote is paired with the subsequent closing quote, which is not the case for escaped quotes. We probably should transform string literals so that every escaped quote is replaced with hex escapes.
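One possible transform, sketched below; a real version must only rewrite inside string literals, which a regex alone cannot decide, so this assumes the literal has already been isolated:

```javascript
// Replace escaped quotes inside an already-isolated string literal with hex
// escapes, so the quote model only ever sees a quote as a delimiter.
const fixQuotes = lit =>
  lit.replace(/\\(['"])/g,
    (_, q) => '\\x' + q.charCodeAt(0).toString(16).padStart(2, '0'));
```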
I think `(α=α*997+(ο[λ-δ]|0)|0)` should be `α=α*997+(ο[λ-δ]|0)`.
Hello @lifthrasiir
I ran

```sh
$ npx roadroller src/bang.js
```

on this file: https://github.com/i5ik/BANG/blob/799c5ed70bf0585b11024a1597f21fe5691ead74/src/bang.js

and received this output:

```
Error: invalid JS code in the input
```

The file `src/bang.js` runs correctly in the browser (as you can see). Above, I ran Roadroller on Node v14.17.6. Below, I used nvm to try Node v16.9.1, re-installed roadroller@latest and ran with `-v`, and got the same error as well as:

```
Actual memory usage: 145.9 MB (out of 150 MB)
Error: invalid JS code in the input
```

I tried with more memory (`-M`) and had:

```
Actual memory usage: 600.8 MB (out of 1000 MB)
Error: invalid JS code in the input
```
Just leaving this report here for information.
Thank you again for this library!
The current decompressor is optimized for size and is very slow (<50 KB/s). I've been asked about the possibility of using Roadroller for larger inputs, possibly around the single-digit megabyte range, and while Roadroller is generally not a good choice for them, I've heard of several (rarer) use cases where Roadroller can actually help.
Possible improvements:
Current:

```js
for(;e<size;)for(x=1;x<128;x=x*2+U)
```

Suggested:

```js
for(;x=e<size;)for(;x<128;x+=x+U)
```
Currently compression parameters have to be manually copied from the search result, but it would be great to have an official way to save them. This would require a sort of configure-from-file option, or we could combine it with the cache output file so that using the same cache parameter automatically reads and writes the cache. It would be best to have the same feature in the API as well, but I'm not sure I can make it work without sacrificing non-Node.js uses.
It is technically possible to abbreviate (i.e. convert into unused single bytes) non-identifier tokens in the JS preprocessor, but I'm not very sure about the outcome. For example if the optimal number of abbreviations is close to the maximum, some previously abbreviated tokens may have to be unabbreviated and that can result in worse compression.
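To make the idea concrete, here is a toy version of such an abbreviation pass; tokenization and finding the pool of unused single bytes are assumed to be handled elsewhere:

```javascript
// Map the most frequent multi-character tokens to unused single-byte
// characters from `pool`, then re-join the token stream.
function abbreviate(tokens, pool) {
  const freq = new Map();
  for (const t of tokens) if (t.length > 1) freq.set(t, (freq.get(t) || 0) + 1);
  // assign the most frequent tokens first; extra tokens stay unabbreviated,
  // which is exactly the situation the text worries about
  const ranked = [...freq].sort((a, b) => b[1] - a[1]).slice(0, pool.length);
  const map = new Map(ranked.map(([t], i) => [t, pool[i]]));
  return tokens.map(t => map.get(t) || t).join('');
}
```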
The current `-O1` targets 1–2 minutes and currently performs about 300 compressions, which is prohibitively slow for interactive use. I'd like to have a faster version that takes less than 10 seconds, so that it can be inserted into the regular build process. If this works well we can possibly make it the default as well, like `-O1`.
Is it possible to use this compressor while adhering to CSP unsafe-eval policies? (i.e. avoid the use of eval(), new Function() etc.)
This only breaks the optimizer, which is why I never realized it (I only tested a one-shot compression).

Roadroller as a library should be usable in the web environment, so we can't blindly import `perf_hooks`. It is more troubling that we can't exactly use `await import('perf_hooks')` either, because Node.js 14 doesn't have top-level await. I think `Packer.optimize` should just have a local `performance` variable which is initialized from either `window.performance` or `(await import('perf_hooks')).performance`, whichever is available.
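Since `Packer.optimize` is already async, the fallback can live inside it. A minimal sketch, where the helper name is mine:

```javascript
// Resolve a performance object without a top-level await: use the global one
// when it exists (browsers, recent Node.js), otherwise import perf_hooks.
async function getPerformance() {
  if (typeof performance !== 'undefined') return performance;
  return (await import('perf_hooks')).performance;
}
```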
The normal JS string literal is limited in its information density, but if we have control over the character encoding we can overcome this limitation. xem's int2binary2html used ISO/IEC 8859-1 (Latin-1, `l1`) for example, which requires 88 bytes to decode:

```js
b=i.charCodeAt(),b>>8?"€ ‚ƒ„…†‡ˆ‰Š‹Œ Ž ‘’“”•–—˜™š›œ žŸ".indexOf(i)+128:b
// note that the string literal above is also encoded in Latin-1, so it is one byte per character.
```

After some testing I concluded that ISO/IEC 8859-5 (`cyrillic`) is a better choice, which can be decoded in 44 bytes of code thanks to its regular assignment:

```js
b=i.charCodeAt(),b>>8?b%3683-864:b-167?b:253
```

Unlike 8859-1 we need to declare the character encoding for 8859-5, but `<meta charset=cyrillic>` is only 23 bytes long, so it is still better than 8859-1.
A critical problem with this approach is that the server-side header is preferred over the client-side metadata, so it is very sensitive to the server setting. In the case of js13kGames the server originally didn't declare the character encoding but it now does, breaking past entries that have used this technique.
In the output, I noticed that it's using the `new` keyword:

```js
new Uint16Array(12<<22).fill(1<<15),new Uint8Array(12<<22),0,0,0)
```

Correct me if I'm wrong, but I don't think it's necessary, and removing it would save a few bytes!

```js
Uint16Array(12<<22).fill(1<<15),Uint8Array(12<<22),0,0,0)
```
Deno is yet another JavaScript runtime.
Deno users mainly import scripts hosted in deno.land rather than NPM.
Here's how to publish it: https://deno.land/x?query=roadroller
This issue lists some wild ideas I had in mind but scrapped for various reasons. Maybe some of them might be useful in other projects.
It turns out that we can put the source map comment into `eval`ed code, and devtools do recognize it. This should not be present in the final build, so I think the resulting code should make it obvious that a source map is in use, like this:

```js
eval(Function(/* compressed data here */)(...)+'\n//# sourceMappingURL=foo.js.map')
```

We would have to preserve the existing `sourceMappingURL` comment and update it (and also `sourceURL`) if the JS preprocessing is in use.
Try this in the online demo:

```js
const f = (t) => 1 - --t;
console.log(f(1));
```

That generates:

```js
eval(Function("[M='EGLYWeQK{NHDV^vz}BAr`{yaBxjDGZic_CHz]FFSD{|c'"
,...']charCodeAtUinyxp',"for(;e<38;c[e++]=x-128)for(x=1;x<128;n=p.map((i,y)=>(t=r[i]*2+1,t=Math.log(t/(h-t)),A-=a[y]*t,t/500)),A=~-h/(1+Math.exp(A))|1,U=o%h<A,o=o%h+(U?A:h-A)*(o>>17)-!U*A,p.map((i,y)=>(t=r[i]+=(U*h/2-r[i]<<13)/((C[i]+=C[i]<5)+1/20)>>13,a[y]+=n[y]*(U-A/h))),x=x*2+U)for(p='010202103203210431053105410642065206541'.split(A=0).map((i,y)=>(t=0,[...i].map((i,y)=>(t=t*997+(c[e-i]|0)|0)),h*32-1&t*997+x)*12+y);o<h*32;o=o*64|M.charCodeAt(d++)&63);return String.fromCharCode(...c)")([],[],1<<17,[0,0,0,0,0,0,0,0,0,0,0,0],new Uint16Array(51e6).fill(1<<15),new Uint8Array(51e6),0,0,0))
```

If you replace `eval` with `console.log`, the result is:

```js
const f=(t)=>1---t;
console.log(f(1));
```

which is invalid: `Invalid left-hand side expression in postfix operation`. The white space between the minus `-` and the decrement `--` is significant.
```sh
npx roadroller -M 1024 -O2 -D -a write -t text input_file.txt -o output_file.js
```

I get:

```
file:///D:/Users/user/node_modules/roadroller/wasm.mjs:677
    new Uint8Array(memory.buffer, M_INPUT).set(input);
    ^

RangeError: Start offset -1879046368 is outside the bounds of the buffer
    at new Uint8Array (<anonymous>)
    at file:///D:/Users/user/node_modules/roadroller/wasm.mjs:677:9
    at compressWithDefaultModel (file:///D:/Users/user/node_modules/roadroller/index.mjs:571:45)
    at Function.doPack (file:///D:/Users/user/node_modules/roadroller/index.mjs:971:75)
    at calculateSize (file:///D:/Users/user/node_modules/roadroller/index.mjs:1363:35)
    at updateBest (file:///D:/Users/user/node_modules/roadroller/index.mjs:1386:26)
    at updateBestAndReportProgress (file:///D:/Users/user/node_modules/roadroller/index.mjs:1397:56)
    at file:///D:/Users/user/node_modules/roadroller/index.mjs:1524:26
    at evaluate (file:///D:/Users/user/node_modules/roadroller/index.mjs:1436:31)
    at search (file:///D:/Users/user/node_modules/roadroller/index.mjs:1446:51)
```

If I use a lower `-M` or run with `-O1` it works fine.
Currently Roadroller gives you a JS code which decompresses into the original input when run. Normally you then have to put it into a template that is not compressed; this is trivial in the API but not in the CLI, so we need CLI options to prepend and append arbitrary uncompressed data to the output. Depending on the design there may be several options.
- `-t asis`: Any input with this type would be printed as is, with an implied `-a write`. This fits well with multiple inputs (#8), but there is a possible ambiguity. For example, `-t asis one.html -t js a.js -t asis two.html -t js b.js -t asis three.html` suggests that `a.js` and `b.js` are separated by `two.html`, but this might not be desirable or even possible depending on the implementation.
- `-t before|after`: These options ensure that the uncompressed input goes either before or after the compressed data. I'm not sure how to handle multiples of them, though.
- `-B|--before` and `-A|--after`: Similar to the previous, but a bit clearer because they are technically not inputs but output transformations (the CLI has an implicit convention that lowercased options relate to inputs and uppercased options relate to outputs).
- `-T|--template`: "Before" and "after" inputs would be related, so a single option might be more appropriate. It is also free from the multiple-uncompressed-inputs problem (we can simply forbid multiple `-T` options). A marker for where the compressed JS goes would have to be chosen, however.

Also, unlike most compressed inputs, uncompressed inputs can be as short as `<script>` and `</script>`, so ideally it should be possible to give them directly on the command line.
Roadroller currently uses simulated annealing for sparseSelectors (which have a large multidimensional space) and a modified binary search for all other (linear) parameters. This works relatively well, but there might be a better way to optimize them all together.

It was observed that sparseSelectors search results vary wildly across runs, so we want to reduce the chance of running into "unhelpful" neighbors. I've already tested a form of tabu search in the past, but it didn't work well: sparseSelectors have a large number of possible neighbors, so we could only mark a small portion of them over the course of the search.

The modified binary search is based on the assumption that the size function to be minimized is convex and has a relatively large "bowl" across the entire range. Both are only approximately true, and sometimes the search misses better parameters. The biggest reason is that we don't have a derivative of the size function available, so we can only guess it from a limited number of evaluations.
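For contrast, the convexity assumption can be illustrated with a plain ternary search over an integer range; this is a generic sketch, not Roadroller's actual search, and it finds the minimum only when the size function really is convex:

```javascript
// Minimize f over the integers in [lo, hi], assuming f is convex there.
// If the "bowl" assumption fails, this can return a local minimum only.
function ternarySearchMin(f, lo, hi) {
  while (hi - lo > 2) {
    const m1 = lo + Math.floor((hi - lo) / 3);
    const m2 = hi - Math.floor((hi - lo) / 3);
    if (f(m1) < f(m2)) hi = m2; else lo = m1;
  }
  // finish with a linear scan over the tiny remaining range
  let best = lo;
  for (let x = lo + 1; x <= hi; ++x) if (f(x) < f(best)) best = x;
  return best;
}
```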
In any case, we want a metaheuristic algorithm that:
The current quote modelling is all or nothing, but we could model some quotes (like `'`) and ignore others (like `` ` ``). Or we could give different context hashes to different quotes.
The current output format is fixed to the six-bit code, which is optimized for js13kGames submissions. There are numerous other possibilities though, and I'd like to reserve the `-F|--output-format` option in the CLI for this purpose. For the API it would probably be a new method to be used instead of the current `Packer.makeDecoder`.

The default would remain the six-bit coding, the explicit option being `-F6`. The first digit denotes `outBits`, the number of bits per coded unit. There may be following characters if there are multiple output formats with the same `outBits` or the format requires manual intervention (therefore we would probably never have `-F8`).
Each possible future format would go into its own issue. In this issue I document the current six-bit coding.

`-F6`: Six-bit coding

This is the default and the only possibility as of 2.1.0. The compressed data is encoded into a template literal, where a code unit k is encoded as a code point k or k + 0x40 so that no escape sequence is required. The latter is preferred since it coincides with alphabets, but the former is desirable or even required when k is one of the following:

- `\`
- `(` (the first line has to include this character when `-D` is not in use)
- `=` (the first line has to include this character when `-D` is in use)

This format is designed to be "compressed" again with the Huffman tree. This is ensured by making every code unit correspond to a (mostly consecutive) set of 64 characters and minimizing the number of characters not in the set (we can't completely get rid of them because `` ` `` has to be somewhere in that line). The Huffman tree needs to encode the characters not in the set, so the actual coding rate is slightly below 8 bits/char even after DEFLATE, but the optimal tree should still achieve at least 8 × (384/385) ≈ 7.979 bits/char.