small-jsfuck's People

Contributors

kamil-kielczewski

small-jsfuck's Issues

Check and change the jsf representations of digits

After small statistical experiments, it looks like for average code converted to base4 the most common digits are 1, 0 and 2, and the least common is 3.

Check this on popular code files (like jQuery etc.). If true, then switch the number representation to

0 --> 0       jsf:    +!![]
1 --> false   jsf:     +![]
2 --> true    jsf:   +(+[])
3 --> 1       jsf: +(+!![])

As a result, the bootstrap needs an additional .replace(/1/g,3) (previously 0 and 1 had the same representation as in jsf). It has to run first, before false is turned into 1, otherwise those new 1s would also become 3:

.replace(/1/g,3).replace(/true/g,2).replace(/false/g,1)

  1. check how much smaller a typical 1kB of code is after this change
  2. check how much the bootstrap code grows (if only a few kB and point 1 looks good, then the change is worth making)
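
The order of the replace calls matters for this mapping. Below is a minimal sketch, assuming the digit-to-token mapping from the middle column of the table above (0 → "0", 1 → "false", 2 → "true", 3 → "1"). The names `encodeDigits`, `decodeDigits` and `decodeDigitsWrongOrder` are hypothetical and not part of small-jsfuck.

```javascript
// Tokens as they appear once the JSF expressions are evaluated and joined
// (assumed mapping: 0 -> "0", 1 -> "false", 2 -> "true", 3 -> "1").
const tokens = { 0: "0", 1: "false", 2: "true", 3: "1" };

function encodeDigits(digits) {
  return [...digits].map(d => tokens[d]).join("");
}

// Turn the literal "1" into 3 first, then introduce the new 2s and 1s.
function decodeDigits(s) {
  return s.replace(/1/g, 3).replace(/true/g, 2).replace(/false/g, 1);
}

// Running .replace(/1/g, 3) last also rewrites the 1s produced from "false".
function decodeDigitsWrongOrder(s) {
  return s.replace(/true/g, 2).replace(/false/g, 1).replace(/1/g, 3);
}

const encoded = encodeDigits("0123");         // "0falsetrue1"
console.log(decodeDigits(encoded));           // "0123"
console.log(decodeDigitsWrongOrder(encoded)); // "0323"
```

With .replace(/1/g,3) placed last, digit 1 can never be recovered, so the round trip only works when it runs before false is replaced.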

Procedure for generating text statistics

Create a reusable function which prepares statistics for the encoded text:

  1. the text will be encoded to base4, base8 or base9 (maybe base16)
  2. the procedure counts the popularity of each baseX character and returns an array of baseX characters sorted from most to least popular

This is already done for base8 in this fiddle
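
A minimal sketch of such a function could look as follows. It assumes that each UTF-16 code unit is encoded as its char code written in the target base; the actual small-jsfuck encoding may differ, and the separator digit used by the base9 variant is not modelled. `digitStats` is a hypothetical name.

```javascript
// Returns the digits of the given base, sorted from most to least frequent
// in the encoded text. Assumption: every UTF-16 code unit is written as its
// char code in `base`, without padding.
function digitStats(text, base) {
  const counts = new Map();
  for (let d = 0; d < base; d++) counts.set(d.toString(base), 0);

  for (let i = 0; i < text.length; i++) {
    for (const digit of text.charCodeAt(i).toString(base)) {
      counts.set(digit, counts.get(digit) + 1);
    }
  }

  // Array.prototype.sort is stable, so ties keep their numeric order.
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([digit]) => digit);
}

// "a" has char code 97, which is "1201" in base4.
console.log(digitStats("aaa", 4)); // [ '1', '0', '2', '3' ]
```

Running it over larger files (for example a jQuery build) would give the ordering needed to decide which digit gets the shortest jsf representation.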

Alternative coding base8

Instead of the #4 approach we can try the following modification - we assume that each character has a fixed-width representation of 4 characters: a backslash and three octal digits (padded with 0 - details here)

So we can change this '\141\154\145\162\164\50\61\51' to this '\141\154\145\162\164\050\061\051' - and now we don't need to store the backslashes at all (the base8 code representation now uses 8 digits - not 9 like before)

eval(eval("'\\"+ "141154145162164050061051".match(/.../g).join("\\")+"'"))

We need to perform an investigation similar to the one for #4.

Bootstrap: we can use the above approach with the current small-jsfuck algorithm, but with toString(8) and parseInt(x,8)
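
The padded variant can be sketched with toString(8) and parseInt(x, 8) as mentioned above. This is only an illustration: `toPaddedOctal` and `fromPaddedOctal` are hypothetical names, and the decoder shown here avoids eval so that the round trip is easy to check.

```javascript
// Encode every character as exactly three octal digits (zero-padded).
// Octal escapes only reach \377, so larger char codes are rejected.
function toPaddedOctal(text) {
  return [...text].map(ch => {
    const code = ch.charCodeAt(0);
    if (code > 0o377) {
      throw new RangeError("char code " + code + " does not fit in 3 octal digits");
    }
    return code.toString(8).padStart(3, "0");
  }).join("");
}

// Split into groups of three digits and turn each group back into a character.
function fromPaddedOctal(digits) {
  return (digits.match(/.../g) || [])
    .map(group => String.fromCharCode(parseInt(group, 8)))
    .join("");
}

console.log(toPaddedOctal("alert(1)"));                    // "141154145162164050061051"
console.log(fromPaddedOctal("141154145162164050061051"));  // "alert(1)"
```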

Add a tool to quickly rebuild the deconverter

Write a tool which allows building the deconverter faster. This tool can be written in jsfiddle.net (but the final code should be copied here)

Alternative coding base9

Again we go back to the idea from #4, but now the goal is not the best compression ratio but a shorter decoder, which will be used for small codes. There are 3 versions to check (the last one has the best compression ratio)

eval(eval("'91419154914591629164950961951'".replace(/9/g, "\\")))

eval(eval("'false141false154false145false162false164false50false61false51'".split("false").join("\\")))

eval(eval("'\\"+"141154145162164050061051".match(/.../g).join("\\")+"'"))

So we actually don't want to use the shortest available jsf representation for each number - each number will be directly represented by its jsf number code. This allows a shorter decoder (at the cost of a lower "compression" ratio)
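
The payloads for the first two versions can be produced with a small helper. The sketch below assumes unpadded octal codes prefixed by a separator ("9" or "false"); `encodeWithSeparator` and `decodeWithSeparator` are hypothetical names, and the decoder uses parseInt instead of eval purely to verify the round trip.

```javascript
// Prefix every octal char code with the separator, e.g. "a" -> "9141".
function encodeWithSeparator(text, separator) {
  return [...text]
    .map(ch => separator + ch.charCodeAt(0).toString(8))
    .join("");
}

// Split on the separator; the first piece is empty because the payload
// starts with a separator.
function decodeWithSeparator(encoded, separator) {
  return encoded
    .split(separator)
    .slice(1)
    .map(octal => String.fromCharCode(parseInt(octal, 8)))
    .join("");
}

console.log(encodeWithSeparator("alert(1)", "9"));
// "91419154914591629164950961951"
console.log(encodeWithSeparator("alert(1)", "false"));
// "false141false154false145false162false164false50false61false51"
```

The digit 9 works as a separator because it never occurs in an octal number, and "false" works because it contains no digits at all.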

Add option to deconvert with non-deprecated functions

(un)escape - gate for all

Having the following characters: 123456789 aceflnmoprstu (which can be achieved one by one) we are able to get the other lower/upper case letters and some other characters without using deprecated methods like italics. Deprecated methods were used in jsfuck for size optimizations. To avoid them we can use the escape and unescape methods - the technique is based on this question and answer. For example, for the letter C (which has hexadecimal escape code 43) we can do it as follows (we show 5 steps of the formula's evolution towards jsf)

step1:  unescape("%43")
step2:  unescape(escape(" ")[0]+43)
step3:  unescape(escape((NaN+[]["flat"])[11])[0]+43)
step4:  Function("return unescape")()(Function("return escape")()(" ")[0]+43)
step5:  []["flat"]["constructor"]("return unescape")()([]["flat"]["constructor"]("return escape")()((NaN+[]["flat"])[11])[0]+43)

Using this approach we have access to: !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_ abcdefghijklmnopqrstuvwxyz{|}~ + tilde character and more. With this technique we don't need String.fromCharCode (which in the old approach forced the use of deprecated methods like 'italics' or 'fontcolor' etc.)
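
The first three steps generalize to any character with a two-digit hex code. The following is a plain-JavaScript sketch of that idea, not the jsf form itself; `charViaUnescape` is a hypothetical name.

```javascript
// (NaN + [].flat) is "NaNfunction flat() { [native code] }",
// so index 11 is the space between "function" and "flat".
const space = (NaN + []["flat"])[11];

// escape(" ") is "%20", which gives us the percent sign at index 0.
const percent = escape(space)[0];

// Build "%XX" from the two-digit hex code and let unescape decode it.
function charViaUnescape(ch) {
  const hex = ch.charCodeAt(0).toString(16).padStart(2, "0");
  return unescape(percent + hex);
}

console.log(charViaUnescape("C")); // "C" (built from "%43")
console.log(charViaUnescape("{")); // "{" (built from "%7b")
```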

For the letter "C" we can also use the shortcut below, based on escape only, discovered by Siguza

step1: Function("return escape")()(",")[2]
step2: []["flat"]["constructor"]("return escape")()([[]]["concat"]([[]])+[])[2]
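
Why the shortcut works can be checked step by step. The last lines of this sketch go slightly beyond the original note: they assume the same trick applies to other characters whose escape code ends in a hex letter.

```javascript
// [[]].concat([[]]) is [[], []]; adding [] stringifies it to ",".
const comma = [[]]["concat"]([[]]) + [];

// escape(",") is "%2C", so index 2 is the letter "C".
const letterC = escape(comma)[2];
console.log(letterC); // "C"

// Characters 0x3A..0x3F are all escaped, so their last hex digit yields A..F.
const hexLetters = [":", ";", "<", "=", ">", "?"].map(ch => escape(ch)[2]);
console.log(hexLetters.join("")); // "ABCDEF"
```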

Provide a "checkbox" which will allow compiling with a deconverter based on the above non-deprecated functions

List of all characters:

[...Array(256)].map((x,i)=> unescape(`%${i.toString(16).padStart(2,"0")}`))

Alternative coding

Check octal coding idea proposed by jsfuck author aemkei here:

I like the idea of encoding the characters into numbers in a bootstrap to save space.

Have you thought about using octal sequences? This would save some space per character:

EG:

eval(eval("'91419154914591629164950961951'".replace(/9/g, "\\")))
The bootstrap code is ~25k but maybe we can save some bytes by replacing the quotes or backspace.

'\141\154\145\162\164\50\61\51' in chrome console gives "alert(1)"

Check if this works with emoji/Chinese letters

Emoji and Chinese letters

Currently emoji are encoded the wrong way - check why and find a solution (but if it increases the bootstrap code, then add a proper "checkbox" to enable it). If this causes "big problems" then mark it as "wontfix" (because emoji escape sequences are supported in JS and work, e.g. alert("\u{1f601}") ).
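
One possible workaround, in line with the "wontfix" fallback, is to rewrite such characters as escape sequences before compiling. Octal escapes only reach \377, so anything above char code 255 needs the \u{...} form. This is a sketch under that assumption; `escapeNonLatin1` is a hypothetical name and it is meant for text inside string literals only.

```javascript
// Replace every code point above 0xFF with a \u{...} escape sequence.
// Iterating with the spread operator keeps surrogate pairs together.
function escapeNonLatin1(text) {
  return [...text].map(ch => {
    const codePoint = ch.codePointAt(0);
    return codePoint > 0xff ? "\\u{" + codePoint.toString(16) + "}" : ch;
  }).join("");
}

const grin = String.fromCodePoint(0x1f601);
console.log(escapeNonLatin1('alert("' + grin + '")')); // alert("\u{1f601}")
```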

TMP

this issue is only for temporary work...

Design the architecture of the future release of the compiler

In a future release the user will:

  1. Put in their code
  2. select whether to compile the code using the deprecated (small), partially deprecated (medium) or non-deprecated (biggest) version of the decoder.
  3. small-jsf, depending on the code size, will choose the method (base4, base8 or base9 - and maybe base16)
  4. small-jsf will use text statistics #10 to assign the shortest jsf representations to the most popular baseX characters
  5. show the output code to the user and allow downloading it as a .js file

The compiler should be written in such a way that replacing the jsf representation for the decoder is automatic and easy (this is done for base8 here)

Optimisation - statistical analysis of input code

Optimisation - let's analyse which characters are most used in the input code and, based on that, prepare a proper base4-to-jsf map (see #1). In this case the decompressor should be constructed dynamically (or we can have the 4! permutations hardcoded). The main decisions are: which digit, 1 or 0, should get the shortest representation, and in which order 2 and 3 get the two longer representations. So we need 4 decompressor variants (this can also play an important role if base8 turns out to be shorter, #6)
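
The dynamic variant can be sketched as follows: count the base4 digits of the input and give the shortest representation to the most frequent digit. The four strings are the jsf snippets listed in #1 and are used here only for their lengths; the base4 encoding (char code via toString(4)) is an assumption, and `assignRepresentations` is a hypothetical name.

```javascript
// Returns a map from base4 digit to representation, where more frequent
// digits receive shorter representations.
function assignRepresentations(text, representations) {
  const counts = { 0: 0, 1: 0, 2: 0, 3: 0 };
  for (let i = 0; i < text.length; i++) {
    for (const digit of text.charCodeAt(i).toString(4)) counts[digit]++;
  }

  // Most frequent digit first; ties keep numeric order (sort is stable).
  const digitsByFrequency = Object.keys(counts)
    .sort((a, b) => counts[b] - counts[a]);
  // Shortest representation first.
  const repsByLength = [...representations]
    .sort((a, b) => a.length - b.length);

  const map = {};
  digitsByFrequency.forEach((digit, i) => { map[digit] = repsByLength[i]; });
  return map;
}

const reps = ["+!![]", "+![]", "+(+[])", "+(+!![])"];
console.log(assignRepresentations("aaa", reps));
// { '0': '+!![]', '1': '+![]', '2': '+(+[])', '3': '+(+!![])' }
```

The decompressor then has to apply the inverse of the chosen map, which is why either a dynamically built decoder or a fixed set of variants is needed.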
