
Comments (67)

evanw avatar evanw commented on May 5, 2024 104

This is definitely something I plan to get to because I want to be able to use it myself. Right now import(path) turns into Promise.resolve().then(() => require(path)) so dynamic imports still "work" although they don't result in additional bundles. In the future it will generate separate bundles. I may also add support for common chunk and/or more advanced shared dependency analysis.

from esbuild.

evanw avatar evanw commented on May 5, 2024 74

I don't have a specific date but I'm currently focused on a rewrite of the bundler to enable code splitting, tree shaking, ES6 module export, and a few other features. I have to do these together because they are all interrelated.

I've done the R&D prototype to prove it out and I've settled on an approach. I'm currently working on doing the rewrite for real on a local branch. There's still a lot left to do to not break features I've added in the meantime (stdin/stdout support, transform API, etc) so it'll take a while. I have a lot of test failures to work through :)

I was worried about the performance hit because the graph analysis algorithms inherently reduce parallelism, but some early performance measurements seem to indicate that it won't slow it down that much, if any. I hope to ship this sometime in the next few weeks. We'll see how it goes!


evanw avatar evanw commented on May 5, 2024 36

I just released version 0.5.15 with an experimental version of code splitting. See the release notes for details. It's still a work in progress but it's far enough along now that it's ready for feedback. Please try it out and let me know what you think.


weilandia avatar weilandia commented on May 5, 2024 27

Trying to track down a recent update on "Code splitting is still a work in progress. It currently only works with the esm output format. There is also a known ordering issue with import statements across code splitting chunks."

Is a fix for these issues planned?


jpmaga avatar jpmaga commented on May 5, 2024 19

@evanw do you have any kind of roadmap somewhere for esbuild? I am particularly interested in this feature, and would be cool to know where it is in terms of planning. Cheers.


overlookmotel avatar overlookmotel commented on May 5, 2024 18

I'm late to the party on this one, but a few thoughts on chunk splitting strategies...

Code examples below use the convention that a shared chunk which contains code required by entry points "a", "b" and "c" is named "a_b_c". ESBuild actually uses content hashes for filenames, but I'm ignoring that for now. Hopefully it makes sense.

OK, going back to basics for a minute...

1. The penalty of many chunks

In my (basic) understanding, if you're serving files over HTTP/2, the penalty for an app being split into a large number of chunks is minimal, as long as the import statements for the shared chunks appear in the entry point chunks, not nested within other shared chunks.

This will only require 2 round trips to the server: 1 to fetch a.js, and a 2nd to fetch a_b.js, a_c.js and a_b_c.js in "parallel":

// a.js
import ab from './a_b.js';
import ac from './a_c.js';
import abc from './a_b_c.js';
// Now do stuff with them

Whereas this will take 3 round trips, as the browser has to wait for a_b.js or a_c.js to arrive before it knows it needs a_b_c.js:

// a.js
import ab from './a_b.js';
import ac from './a_c.js';
// Now do stuff with them

// a_b.js
import abc from './a_b_c.js';
// Now do stuff with it

// a_c.js
import abc from './a_b_c.js';
// Now do stuff with it
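
The round-trip counts in the two layouts above can be checked with a small sketch. It assumes the browser only discovers a chunk's imports after that chunk has arrived (no preload hints) and that the import graph is acyclic. The `ImportGraph` type and `roundTrips` function are hypothetical names used only for illustration.

```typescript
// Map from chunk name to the chunks it statically imports.
type ImportGraph = Record<string, string[]>;

// Number of sequential fetches needed before every chunk reachable from
// `entry` has arrived. Assumes an acyclic graph and no preloading.
function roundTrips(graph: ImportGraph, entry: string): number {
  const deps = graph[entry] ?? [];
  let deepest = 0;
  for (const dep of deps) {
    deepest = Math.max(deepest, roundTrips(graph, dep));
  }
  return 1 + deepest;
}

// All shared chunks imported directly by the entry point.
const hoisted: ImportGraph = {
  "a.js": ["a_b.js", "a_c.js", "a_b_c.js"],
};

// a_b_c.js is only discovered after a_b.js or a_c.js has loaded.
const nested: ImportGraph = {
  "a.js": ["a_b.js", "a_c.js"],
  "a_b.js": ["a_b_c.js"],
  "a_c.js": ["a_b_c.js"],
};

console.log(roundTrips(hoisted, "a.js")); // 2
console.log(roundTrips(nested, "a.js")); // 3
```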

2. The advantage of many chunks

If a chunk is small, its content is less likely to change. Therefore (1) it will remain cached for longer and (2) it will rarely change filename due to content change, and so rarely cause cascading changes in files which import it.

3. Why do people want manual control over chunks?

I suspect the main reason is along these lines:

  1. Your app has 2 pages, home and about.
  2. You use React.
  3. You very rarely update React, but the rest of your app changes often.
  4. ESBuild cunningly recognises that only home uses React.useState and only about uses React.useEffect. It puts useState in the home.js chunk, and useEffect in the about.js chunk. Great!
  5. You update the code for home.
  6. The home.js chunk is changed and the browser needs to download it again. It also ends up downloading the code for React.useState again, even though it's not changed, because that's bundled in the home.js chunk.
  7. You are unhappy. You say "I wish I had manual control over the chunks so I could stop this madness!"

I contend: The user asking for manual control has correctly identified the problem, but not necessarily the best solution.

Rather than allowing the user to say "I want all of React in one chunk", you could allow them to say "I don't want React to be mixed into a chunk containing other non-React code".

The 1st gives you:

// home.js
import {useState} from './react.js';

// about.js
import {useEffect} from './react.js';

// react.js
export function useState() {}
export function useEffect() {}

home.js and about.js are both importing code they never use.

The 2nd would result in:

// home.js
import {useState} from './home_react.js';

// about.js
import {useEffect} from './about_react.js';

// home_react.js
export function useState() {}

// about_react.js
export function useEffect() {}

The two entry points only download the code they need, but home_react.js and about_react.js will not change unless React is updated, so they can be cached for longer.

This would actually be quite easy to implement without changing ESBuild's splitting algorithm. You just need to introduce a pseudo-entry point containing export * from 'react', and ESBuild will split the chunks as above.

I'm not convinced that most users really do want manual control. I think most would prefer the build tool to do everything for them if it can do it as well as they could.

4. Conclusions

ESBuild's algorithm very cleverly produces the optimum split chunks for a given set of code. It does not, however, optimize for:

  1. how the code is loaded and
  2. caching - reducing chunk churn as the codebase changes over time

(1) refers to my example in "The penalty of many chunks" above. Hoisting all imports to the entry chunks might require more bytes of output, but it might still load faster due to fewer round trips to the server.

The beginnings of a possible solution to (2) are discussed above. You could even take a more radical approach and aim to produce as many chunks as possible. That way, caching will be even more durable, at the cost of a huge number of chunks.

However, that does assume HTTP/2 and therefore that producing loads of chunks is not a problem. A really good splitting strategy would be adaptable for the circumstances of either HTTP/1 or HTTP/2.

Perhaps all cases could be covered with two user settings:

  1. List of modules unlikely to change often, in priority order
  2. Limit on total number of chunks / limit on number of chunks per entry point (e.g. no more than 5 imports for any entry point)

ESBuild would split the code into chunks based on these 2 constraints: start with an unlimited number of chunks, and combine them as required to hit the desired number, guided by the list of what is most important to split off.

Sorry that's really long. It's a very interesting area!


evanw avatar evanw commented on May 5, 2024 17

I have a small progress update on code splitting. From the release notes for the upcoming release (not out yet):

Code that is shared between multiple entry points is separated out into "chunk" files when code splitting is enabled. These files are named chunk.HASH.js where HASH is a string of characters derived from a hash (e.g. chunk.iJkFSV6U.js).

Previously the hash was computed from the paths of all entry points which needed that chunk. This was done because it was a simple way to ensure that each chunk was unique, since each chunk represents shared code from a unique set of entry points. But it meant that changing the contents of the chunk did not cause the chunk name to change.

Now the hash is computed from the contents of the chunk file instead. This better aligns esbuild with the behavior of other bundlers. If changing the contents of the file always causes the name to change, you can serve these files with a very large max-age so the browser knows to never re-request them from your server if they are already cached.

Note that the names of entry points do not currently contain a hash, so this optimization does not apply to entry points. Do not serve entry point files with a very large max-age or the browser may not re-request them even when they are updated. Including a hash in the names of entry point files has not been done in this release because that would be a breaking change. This release is an intermediate step to a state where all output file names contain content hashes.

The reason why this hasn't been done before now is because this change makes chunk generation more complex. Generating the contents of a chunk involves generating import statements for the other chunks which that chunk depends on. However, if chunk names now include a content hash, chunk generation must wait until the dependency chunks have finished. This more complex behavior has now been implemented.

Care was taken to still parallelize as much as possible despite parts of the code having to block. Each input file in a chunk is still printed to a string fully in parallel. Waiting was only introduced in the chunk assembly stage where input file strings are joined together. In practice, this change doesn't appear to have slowed down esbuild by a noticeable amount.
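
To make the ordering constraint from the release notes concrete, here is a minimal sketch of naming chunks by content hash in dependency order. It is not esbuild's implementation: the `Chunk` shape, `nameChunks`, and the FNV-1a hash are stand-ins chosen for illustration, and the sketch assumes the chunk graph is acyclic.

```typescript
// A chunk before final assembly: its code plus the chunks it imports.
interface Chunk {
  code: string;
  imports: string[]; // keys of other chunks in the same map
}

// FNV-1a, used only as a small stand-in hash for this sketch.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Assign content-hashed file names. A chunk's final text embeds the file
// names of its dependencies, so each dependency has to be named first.
// Assumes there are no import cycles between chunks.
function nameChunks(chunks: Record<string, Chunk>): Record<string, string> {
  const names: Record<string, string> = {};
  const visit = (key: string): string => {
    if (names[key] !== undefined) return names[key];
    const chunk = chunks[key];
    const importLines = chunk.imports.map((dep) => `import "./${visit(dep)}";`);
    const finalText = [...importLines, chunk.code].join("\n");
    names[key] = `chunk.${fnv1a(finalText)}.js`;
    return names[key];
  };
  Object.keys(chunks).forEach(visit);
  return names;
}
```

In this sketch, editing a chunk that nothing else imports leaves every other name untouched, while editing a shared chunk renames it and, through the embedded import path, its dependents as well.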


ephemer avatar ephemer commented on May 5, 2024 13

Hi there!

I'm just wondering if code splitting with IIFE is still interesting to @evanw and others. As much as we'd love to use esm for our browser output, we currently need to use IIFE due to naming conflicts with global variables in external library code. I assume our case is fairly common, because esbuild's default for target: "browser" is to use IIFE. I also assume that browser output would stand to gain the most benefit from code splitting compared to other environments.

tl;dr I'm wondering if there's a way to use esbuild differently to get code splitting working today (for example somehow working around the conflicts in the global namespace and using esm directly) and if not whether the iife format is still destined to receive the code splitting kiss of life at some point. I know @evanw has mentioned a couple of times that it's on the cards but the last time was in November and things can certainly change over time.

I do want to add that I understand this is just one of many features and bugs on the esbuild radar, so would totally understand any answer here or none at all. In any case, thanks so much for esbuild, it is an awesome piece of software engineering and such a breath of fresh air to the ecosystem.


evanw avatar evanw commented on May 5, 2024 12

Do you have any news about this?

It's mostly working already. The chunk splitting analysis has already landed. All that's left is to bind imports and exports across chunks. I'm working on that in a branch and this will be my main focus soon.


evanw avatar evanw commented on May 5, 2024 11

Another code splitting update:

I finally got around to implementing per-chunk symbol renaming, which I view as required for the code splitting feature. I've made several attempts at this in the past but I haven't landed them because I don't want to severely regress performance (or memory usage, which I've started to also pay attention to). I finally figured out a good algorithm for doing per-chunk symbol renaming that's fast and parallelizable while not using too much memory. It's actually two algorithms, one when minifying and a different one when not minifying.

From the release notes:

Previously, bundling with code splitting assigned minified names using a single frequency distribution calculated across all chunks. This meant that typical code changes in one chunk would often cause the contents of all chunks to change, which negated some of the benefits of the browser cache.

Now symbol renaming (both minified and not minified) is done separately per chunk. It was challenging to implement this without making esbuild a lot slower and causing it to use a lot more memory. Symbol renaming has been mostly rewritten to accomplish this and appears to actually usually use a little less memory and run a bit faster than before, even for code splitting builds that generate a lot of chunks. In addition, minified chunks are now slightly smaller because a given minified name can now be reused by multiple chunks.
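
As a rough illustration of why per-chunk renaming helps caching, the sketch below assigns short names using only the symbol frequencies of a single chunk. This is a deliberately simplified stand-in and not esbuild's algorithm: `shortName` and `minifyChunk` are hypothetical, and a real minifier would also have to skip reserved words such as "do" or "if".

```typescript
// Generate the n-th short name: a, b, ..., z, aa, ab, ...
function shortName(index: number): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz";
  let name = "";
  let n = index;
  do {
    name = alphabet[n % 26] + name;
    n = Math.floor(n / 26) - 1;
  } while (n >= 0);
  return name;
}

// Rename the symbols of one chunk. The most frequently used symbol gets the
// shortest name; ties are broken alphabetically so output is deterministic.
function minifyChunk(useCounts: Record<string, number>): Record<string, string> {
  const symbols = Object.keys(useCounts).sort(
    (x, y) => useCounts[y] - useCounts[x] || (x < y ? -1 : x > y ? 1 : 0),
  );
  const renamed: Record<string, string> = {};
  symbols.forEach((symbol, i) => {
    renamed[symbol] = shortName(i);
  });
  return renamed;
}

// Each chunk is renamed independently, so both chunks can reuse "a".
console.log(minifyChunk({ render: 10, helper: 3 })); // { render: "a", helper: "b" }
console.log(minifyChunk({ parse: 7 })); // { parse: "a" }
```

Because each call sees only one chunk's counts, a change in one chunk cannot shift the names chosen in another.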


evanw avatar evanw commented on May 5, 2024 10

does that imply that circular references cannot be built (since each file contains a reference to another)?

Yes, code splitting currently generates an acyclic module graph.

The current automatic code splitting algorithm makes sure that a) a given piece of code only ever lives in one chunk and b) a given entry point doesn't import any code that it won't use. This means it generates one chunk for each unique overlap of entry points. So if there are three entry points A, B, and C, that means there could potentially be up to 7 chunks: A, B, C, A+B, A+C, B+C, and A+B+C. The chunk for A would only include code accessible by A but not by B or by C, the chunk A+B includes all code accessible by A and B but not by C, and the chunk A+B+C is for all code that is used by all entry points. Because of this structure, cyclic imports are never generated, by construction. Two chunks wouldn't ever need to import each other because if they do reference each other, they would be considered a connected component in the graph and would have been written out as part of the same chunk.
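
The grouping rule described above can be sketched at file granularity (esbuild itself works on smaller pieces of code, so this is only an approximation). The `Reachability` type, `assignChunks`, and the file names are made up for the example; the input is assumed to already list, for each file, which entry points can reach it.

```typescript
// For each input file, the entry points that can reach it.
type Reachability = Record<string, string[]>;

// Group files by the exact set of entry points that use them. Each distinct
// set becomes one chunk, labelled like "A+B".
function assignChunks(reach: Reachability): Map<string, string[]> {
  const chunks = new Map<string, string[]>();
  for (const [file, entries] of Object.entries(reach)) {
    const label = [...entries].sort().join("+");
    const files = chunks.get(label) ?? [];
    files.push(file);
    chunks.set(label, files);
  }
  return chunks;
}

const grouped = assignChunks({
  "a.js": ["A"],
  "b.js": ["B"],
  "c.js": ["C"],
  "date-utils.js": ["A", "B"],
  "logger.js": ["A", "B", "C"],
  "http.js": ["C", "A", "B"],
});

// Five of the seven possible chunks exist here: A, B, C, A+B and A+B+C.
console.log([...grouped.keys()]);
```

With n entry points there are at most 2^n - 1 non-empty sets, which is where the figure of 7 chunks for three entry points comes from.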

This automatic algorithm was a good experiment but it has some drawbacks. The main drawback is just that it's automatic. Many people want to have control over the algorithm in various ways. With many entry points, I'm sure you can see how the current algorithm can potentially generate a lot of chunks due to the combinatorial explosion of overlaps. People familiar with ESM have said that this is fine since the browser can handle a lot of chunk files (>100). Other people are turned off by the idea of having lots of generated chunks and have been requesting manual control over chunk files. Potentially people are just more used to fewer chunks from Webpack setups with manual chunk generation and lack of HTTP/2. I'm not sure what to think about the trade-offs between these approaches because I haven't done extensive performance analysis myself.

To implement manual chunk assignment you would need two things:

  1. You would need the ability to include code in the bundle that's guaranteed to never be used.

    For example, people may want to direct esbuild to turn a whole library into a single chunk even though not all of that library is used by all entry points. This will result in dead code. Right now this is impossible because esbuild's tree shaking algorithm automatically removes dead code. I'm currently designing a different linking model that will allow for keeping dead code while still keeping most of the benefits of ESM's static binding. It involves making module execution lazily-evaluated while still keeping module binding eagerly-evaluated. I'm not sure if this approach will work out but it seems hopeful.

  2. You would need the ability for chunks to potentially participate in an import cycle.

    Manual chunk assignment means esbuild can't generate an acyclic graph since code in a connected component may have multiple different manual chunk labels. I can think of two ways of linking cyclic chunks together. One way is to use dummy text for import paths, calculate all of the file hashes, then swap the dummy text for the real import paths. The file hashes will be "wrong" in that they won't be a hash of the ultimate file contents, but presumably it'd still be ok for cache invalidation as long as you mix in the hashes of all files involved in a cycle with each other. The other way is to pull out the hashes into an import map. That adds a level of indirection between the import paths and the actual hashed file names. It can lead to better caching because changing a dependency doesn't involve also changing the dependents, but import maps aren't a part of the web platform yet so this approach is presumably not viable for a while.
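
The first option (dummy import paths that are swapped out after hashing) might look roughly like the sketch below. Everything here is an assumption made for illustration: the placeholder format, the `linkCycle` function, the FNV-1a stand-in hash, and the choice to treat all chunks passed in as members of one cycle.

```typescript
// FNV-1a, used only as a small stand-in hash for this sketch.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Dummy text that stands in for a chunk's final file name.
const placeholder = (key: string): string => `__CHUNK_${key}__`;

// `templates` maps a chunk key to its code, written with placeholders where
// the import paths will go. Returns final file name -> final file text.
function linkCycle(templates: Record<string, string>): Record<string, string> {
  const keys = Object.keys(templates).sort();
  // 1. Hash every template while the placeholders are still in place.
  const mixed = keys.map((key) => fnv1a(templates[key])).join("");
  // 2. Mix in the hashes of all members, so a change to any chunk in the
  //    cycle renames every chunk in the cycle.
  const names: Record<string, string> = {};
  for (const key of keys) names[key] = `chunk.${fnv1a(key + mixed)}.js`;
  // 3. Swap the placeholders for the real file names.
  const output: Record<string, string> = {};
  for (const key of keys) {
    let text = templates[key];
    for (const dep of keys) text = text.split(placeholder(dep)).join(names[dep]);
    output[names[key]] = text;
  }
  return output;
}

const linked = linkCycle({
  x: `import { y } from "./${placeholder("y")}"; export const x = 1;`,
  y: `import { x } from "./${placeholder("x")}"; export const y = 2;`,
});
console.log(Object.keys(linked));
```

As noted above, the resulting names are not hashes of the final file contents, but they still change whenever any member of the cycle changes.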

That's where my thinking is at the moment. I'm currently in the design phase for the next version of code splitting. The next iteration should hopefully finish the code splitting feature. I want to address the current known import ordering bug, get code splitting working for the cjs and iife formats, and potentially also implement manual chunk assignment. And it'd be really great to do top-level await too, although I may punt on that.

Edit: part of why I'm posting this is that I'm curious what people think about the path embedding approach vs. the import map approach.


evanw avatar evanw commented on May 5, 2024 9

Simple CSS support is the next main thing we are eagerly looking forward to.

You and me both! CSS support is currently the next major feature I want to implement after code splitting. That’s tracked by a separate issue, however: #20.


evanw avatar evanw commented on May 5, 2024 8

@evanw it would be very interesting if you could expand somewhere on the exact symbol naming technique you converged on here.

I just wrote up some documentation about the parallel symbol minification algorithm here.

The non-minified symbol renaming algorithm isn't described in the docs yet but it's pretty simple. Just rename symbols to avoid collisions by appending an increasing number to the name until there's no longer a collision. Each symbol will need to check for collisions in all parent scopes. Symbols in top-level scopes must be renamed in serial but symbols in nested scopes can be renamed in parallel.
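
A minimal sketch of that non-minified renaming rule follows. The `NameScope` shape and the function names are hypothetical, and starting the numeric suffix at 2 is an assumption of this sketch. It handles one symbol at a time and does not attempt the serial/parallel scheduling described above.

```typescript
// A scope knows the names it has already claimed and its parent scope.
interface NameScope {
  names: Set<string>;
  parent?: NameScope;
}

// A name is taken if this scope or any parent scope already uses it.
function isTaken(scope: NameScope | undefined, name: string): boolean {
  for (let s = scope; s !== undefined; s = s.parent) {
    if (s.names.has(name)) return true;
  }
  return false;
}

// Pick a collision-free name by appending 2, 3, ... to the original name,
// then record it in the scope so later symbols avoid it too.
function claimName(scope: NameScope, original: string): string {
  let candidate = original;
  for (let suffix = 2; isTaken(scope, candidate); suffix++) {
    candidate = `${original}${suffix}`;
  }
  scope.names.add(candidate);
  return candidate;
}

const topLevel: NameScope = { names: new Set<string>(["foo"]) };
const nestedScope: NameScope = { names: new Set<string>(), parent: topLevel };

console.log(claimName(nestedScope, "foo")); // "foo2"
console.log(claimName(nestedScope, "bar")); // "bar"
```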


evanw avatar evanw commented on May 5, 2024 7

No, supporting these expressions by default is out of scope: https://esbuild.github.io/api/#non-analyzable-imports. You can either support them by switching on language and returning a statically-determined import() expression based on the value of that variable, or use another bundler such as Webpack that does this. In the future, it may be possible to handle code like this with a plugin, but that doesn't currently work.


jbms avatar jbms commented on May 5, 2024 6

The lack of support for IIFE is also particularly unfortunate because currently Firefox does not support esm for web workers, which means esbuild code splitting cannot be used for web workers.


evanw avatar evanw commented on May 5, 2024 6

Code splitting with dynamic import() of a JSON file that has key names at the root that would not be valid JavaScript identifiers yields incorrect named exports:

// a.json
var x_y = "foo";
var a_default = { "x-y": x_y };
export {
  a_default as default,
  x_y as "x-y"
};

This is perfectly valid JavaScript. It uses a new JavaScript syntax feature called Arbitrary Module Namespace Identifiers. I can understand the confusion because this feature was somehow added even though it bypassed the TC39 proposal process, and was therefore not ever really announced despite being a significant addition to the language. But it has already been added to the ECMAScript specification and support for it has shipped in Chrome 90+, Firefox 87+, and node 16+. It's a real JavaScript language feature. As with all new JavaScript language features, you need to make sure to set esbuild's --target= setting appropriately to tell esbuild to not use syntax features that are newer than what your target environment supports. For example, if you pass --target=node14 the x-y export will not be generated.

I propose to simply not try and export those fields separately.

It's true that this is a bundler-specific extension, and not part of a standard. Node doesn't behave this way for example. But it's a useful extension because it lets you import specific fields from the JSON file without importing the whole thing. For example, you can import { version } from './package.json' and all fields except version will be tree-shaken away. With the Arbitrary Module Namespace Identifiers feature you can also import { 'x-y' as x_y } from './a.json' if you need to.


ponsifiax avatar ponsifiax commented on May 5, 2024 5

Hello here,
Do you have any news about this?
That's the last feature we need before we can use it in production 👍


The-Code-Monkey avatar The-Code-Monkey commented on May 5, 2024 5

@evanw I have a different use case from the above. I have this dynamic import:

const getIcon = (name) => {
  return lazy(() => import(`./icons/${name}`));
};

The problem is that esbuild bundles all the icons into the index.js file rather than keeping them in separate files. Is it possible to tell esbuild to leave this import as it is?


millsp avatar millsp commented on May 5, 2024 4

Initially meant to be posted here #1341

In my case, I needed to bundle dependencies and chunk them, while keeping the output files separate (not one big bundle). All that works except that only format: 'esm' is currently supported, so I wrote a plugin to transpile to CJS again 🙃.

Definitely not ideal, I can live with it.

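import * as esbuild from 'esbuild'

// Re-runs esbuild over the ESM output files to convert them to CJS in place.
// Needs `metafile: true` in the outer build, otherwise no outputs are listed.
export const esmSplitCodeToCjs: esbuild.Plugin = {
  name: 'esmSplitCodeToCjs',
  setup(build) {
    build.onEnd(async (result) => {
      const outFiles = Object.keys(result.metafile?.outputs ?? {})
      // Only plain .js outputs; skips source maps and other assets
      const jsFiles = outFiles.filter((f) => f.endsWith('.js'))
      if (jsFiles.length === 0) return

      // No `bundle` option here: each file is only converted, not re-bundled
      await esbuild.build({
        outdir: build.initialOptions.outdir,
        entryPoints: jsFiles,
        allowOverwrite: true,
        format: 'cjs',
        logLevel: 'error',
      })
    })
  },
}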


matthiasg avatar matthiasg commented on May 5, 2024 3

Works really well in initial testing. We will test more complicated setups (rush repo, nested pnpm deps) more fully in the next weeks


arcanis avatar arcanis commented on May 5, 2024 3

I'm currently working to adapt our large codebase so that it compiles with esbuild, but I'm still unsure what's the best path to production given the lack of IIFE bundle splitting. The current options I see are:

  • Don't use bundle splitting, which isn't an option since we're talking about a very, very large bundle.
  • Don't use esbuild in production, which we'd like to avoid since it could lead to differences in prod and dev behaviours
  • Post-process the ESM bundle splitting with something like rollup-plugin-iife, which would require parsing the bundle and transforming it after ESBuild has generated the ESM chunks. This is likely the option we'll end up using, but I suspect it'll affect build performance quite a lot. With esbuild being so fast, postprocessing seems likely to end up the bottleneck...!

Which options have people picked so far and what were the results? @evanw is IIFE bundle splitting still on the roadmap, and is there anything a company could do to help (through sponsorship or external contributions, perhaps)?


evanw avatar evanw commented on May 5, 2024 2

You have to explicitly enable code splitting with --splitting. It's not enabled by default. When enabled, every target of an import() expression is considered to be an entry point. So you should be getting multiple output files in this case. Also code sharing should kick in since then there are multiple entry points.


stefanoverna avatar stefanoverna commented on May 5, 2024 2

Hi, I'm curious to know if dynamic expressions in import()s are currently working, and if there's a plan to support them:

import(`./locale/${language}.json`).then((module) => {
  // do something with the translations
});

I've done some quick tests and it seems that the ./locale/${language}.json part is left unchanged, while with regular imports it correctly modifies the path to include i.e. [chunk].

Thanks for your beautiful work!


hyrious avatar hyrious commented on May 5, 2024 2

@mattfysh If you're using node.js to run this code, then this is because node.js ignores the module field. More details at doc: main-fields. You can import the ESM file directly with:

import {parse} from "css-what/lib/es/index.js"

If you're using esbuild to bundle the code with --platform=node, the reason is the same because esbuild is trying to behave the same as node.js. You can try to add --main-fields=module,main to your build script.

For package authors: If you really want users to use the native ESM way to use your package, at least do this:

"exports": {
	"node": {
		"import": "./dist/index.mjs",
		"require": "./dist/index.js"
	},
	"default": "./dist/index.mjs"
}

More details at doc: how-conditions-work.


jpmaga avatar jpmaga commented on May 5, 2024 1

I don't have a specific date but I'm currently focused on a rewrite of the bundler to enable code splitting, tree shaking, ES6 module export, and a few other features. I have to do these together because they are all interrelated.

I've done the R&D prototype to prove it out and I've settled on an approach. I'm currently working on doing the rewrite for real on a local branch. There's still a lot left to do to not break features I've added in the meantime (stdin/stdout support, transform API, etc) so it'll take a while. I have a lot of test failures to work through :)

I was worried about the performance hit because the graph analysis algorithms inherently reduce parallelism, but some early performance measurements seem to indicate that it won't slow it down that much, if any. I hope to ship this sometime in the next few weeks. We'll see how it goes!

Damn! You're the man. This is the only thing I am missing to start using it in production, in smaller projects for starters, and see how it goes. PS: Have tested a couple locally, without code splitting, and everything worked flawlessly, even in one with a fairly large codebase using react and typescript. 👍


matthiasg avatar matthiasg commented on May 5, 2024 1

@evanw Thanks a lot for this detailed write-up! This is the kind of information required for using a tool such as this.


mtsewrs avatar mtsewrs commented on May 5, 2024 1

@evanw Do you plan on supporting code splitting with other formats apart from esm?


sachinahya avatar sachinahya commented on May 5, 2024 1

Does anyone else do this and is there a real benefit in it? Feels weird to have esbuild squishing everything into one big js file.

The main benefit comes when you include content hashes in the output filenames and configure your server with long term caching headers. Every time you build, any chunks that don't change will retain the same filename so that users who already have cached copies can continue to use those and only redownload the updated chunks. Generally vendor chunks are much larger and change less frequently than your app chunks.
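
As a sketch of the server side of this, the function below picks a Cache-Control header per file. The `cacheControlFor` name and the idea of passing in the set of unhashed files are assumptions for illustration; how you obtain that set (for example, from your list of entry points) depends on your build.

```typescript
// Files whose names contain a content hash never change under the same name,
// so they can be cached for a year. Files with stable names (such as entry
// points without a hash) must be revalidated on every request.
function cacheControlFor(fileName: string, unhashedFiles: Set<string>): string {
  return unhashedFiles.has(fileName)
    ? "no-cache"
    : "public, max-age=31536000, immutable";
}

const unhashed = new Set<string>(["home.js", "about.js"]);
console.log(cacheControlFor("home.js", unhashed)); // "no-cache"
console.log(cacheControlFor("chunk.iJkFSV6U.js", unhashed)); // "public, max-age=31536000, immutable"
```

Listing the unhashed files explicitly avoids guessing from the file name, which could mistake an ordinary name segment for a hash.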


jpike88 avatar jpike88 commented on May 5, 2024 1

@evanw any ideas on the feasibility, difficulty and timeline of being able to produce a vendor bundle? I’d love to help out but my Go ability is nonexistent


hyrious avatar hyrious commented on May 5, 2024 1

@pft Dynamic imports:

var { "x-y": x_y } = await import("./b.mjs")
console.log(x_y)


pft avatar pft commented on May 5, 2024 1

@hyrious Wow. And, just for completeness' sake, this works too:

import("./a.js").then(({"x-y": x_y}) => console.log(x_y));


hyrious avatar hyrious commented on May 5, 2024 1

@pablo-mayrgundter You need to use a newer target which supports dynamic import (import()), simply edit that field to esnext or something else.


snuricp avatar snuricp commented on May 5, 2024 1

@evanw what's the current status on enabling code splitting with cjs format?


andrewvarga avatar andrewvarga commented on May 5, 2024

This would be awesome to have!


garygreen avatar garygreen commented on May 5, 2024

Excellent news! Thank you for all your hard work on this Evan. Code splitting was vital for us. Does this code splitting feature split css imports into separate files and add them at runtime? Simple CSS support is the next main thing we are eagerly looking forward to.


evanw avatar evanw commented on May 5, 2024

That's great to hear! Thanks so much for trying it out.


guybedford avatar guybedford commented on May 5, 2024

@evanw it would be very interesting if you could expand somewhere on the exact symbol naming technique you converged on here. I'm sure it will make sense looking at the outputs too though of course.


evanw avatar evanw commented on May 5, 2024

@evanw Do you plan on supporting code splitting with other formats apart from esm?

Yes, that's why this issue is still open. However I want to fix issues with the current esm code splitting first: #399.


DanielHeath avatar DanielHeath commented on May 5, 2024

If the file contents are included in the hash, does that imply that circular references cannot be built (since each file contains a reference to another)?

Or is the hash calculated before rewriting the imported filenames?


DanielHeath avatar DanielHeath commented on May 5, 2024

One way is to use dummy text for import paths, calculate all of the file hashes, then swap the dummy text for the real import paths

I think that makes the most sense, though the dummy text needs to be somehow derived from the path so that importing a different path generates a new fingerprint.

So if there are three entry points A, B, and C, that means there could potentially be up to 7 chunks: A, B, C, A+B, A+C, B+C, and A+B+C

More confusing yet - if you have loaded chunk A, then navigate to an area that needs B, you could reasonably want B - A in order to avoid re-fetching libraries used by both.

The chunk-splitting API I would like to use looks something like:

(path: string, suggestedLocations: Array<string>): Array<string>

For instance, if node_modules/react/index.js were passed as the path you could return ["react"] to indicate that esbuild should generate a file output/react-fingerprint.js; all entrypoints that require react will need to reference that file in their HTML.

suggestedLocations would be everywhere the chunk is currently getting written to, e.g. A, B, C, A+B, A+C, B+C, and A+B+C.
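
To illustrate, a callback with the signature proposed above could look like the sketch below. This is a hypothetical API shape taken from this comment, not something esbuild provides; the `ChunkAssigner` type and `assignChunk` names are made up for the example.

```typescript
// Hypothetical callback shape from the proposal above; not an esbuild API.
type ChunkAssigner = (path: string, suggestedLocations: string[]) => string[];

const assignChunk: ChunkAssigner = (path, suggestedLocations) => {
  // Force everything under node_modules/react/ into a chunk named "react".
  if (path.startsWith("node_modules/react/")) return ["react"];
  // Otherwise accept whatever the bundler suggested.
  return suggestedLocations;
};

console.log(assignChunk("node_modules/react/index.js", ["A", "A+B"])); // ["react"]
console.log(assignChunk("src/utils.js", ["A+B"])); // ["A+B"]
```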


evanw avatar evanw commented on May 5, 2024

More confusing yet - if you have loaded chunk A, then navigate to an area that needs B, you could reasonably want B - A in order to avoid re-fetching libraries used by both.

That is the point of splitting up code like this. Sorry, using the same letter for an entry point input file and the corresponding output chunk was confusing.

Let's say the entry points are lower-case letters a, b, and c. Chunk B only includes code for entry point b but not entry points a or c. A given piece of code only ever ends up in one chunk so B - A (and really any intersection between any two chunks) is the empty set.

Code shared between entry points a and b (but not with entry point c) is placed in chunk A+B. Chunk A would import chunks A+B, A+C, and A+B+C to get all the code needed by entry point a. Chunk B would import chunks A+B, B+C, and A+B+C. When you move from chunk A to chunk B, the browser would avoid re-fetching the chunks A+B and A+B+C since it has already fetched them. The browser would only need to download chunk B and B+C (which represents entry point b - a). This would be more clear as a Venn diagram...
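
In place of the Venn diagram, here is a small sketch of the same bookkeeping. It assumes chunk labels of the form "A+B" as used above and lower-case entry point names; `chunksFor` is a hypothetical helper.

```typescript
// Chunks needed by an entry point: every chunk whose label includes it.
function chunksFor(entry: string, allChunks: string[]): string[] {
  const letter = entry.toUpperCase();
  return allChunks.filter((label) => label.split("+").indexOf(letter) !== -1);
}

const allChunks = ["A", "B", "C", "A+B", "A+C", "B+C", "A+B+C"];

// The browser has already loaded everything entry point a needs.
const cached = new Set<string>(chunksFor("a", allChunks));

// Moving to entry point b only requires the chunks that are not cached yet.
const toDownload = chunksFor("b", allChunks).filter((chunk) => !cached.has(chunk));

console.log(chunksFor("a", allChunks)); // ["A", "A+B", "A+C", "A+B+C"]
console.log(toDownload); // ["B", "B+C"]
```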

you could return ["react"] to indicate that esbuild should generate a file output/react-fingerprint.js

I think this is similar to the design I'm thinking of. You can return an optional manual chunk name from your plugin and if it is present, all code with that same manual chunk name will be forced to be in the same chunk and all code without that manual chunk name would be forced to be in some other chunk.

Right now I'm thinking that tree shaking would still be active for manual chunks, although it would only remove code that isn't used by any entry point. This will likely often result in dead code in your bundle because if a shared library is assigned to a manual chunk, all entry points which use that library would be forced to load all code in that library used by any entry point, even if it's only ever used by one entry point. Manual chunking will prevent esbuild from inlining code from that library that's only used by a single entry point directly into that entry point itself.


arcanis avatar arcanis commented on May 5, 2024

Is it correct that bundle splitting only works at the moment for modules shared between multiple entry points? I think I'm hitting the first case defined in the OP: a single-page application, with many asynchronous imports when switching pages. No chunks are generated, causing the output to be super-large.


DanielHeath avatar DanielHeath commented on May 5, 2024

Right now I'm thinking that tree shaking would still be active for manual chunks, although it would only remove code that isn't used by any entry point. This will likely often result in dead code in your bundle because if a shared library is assigned to a manual chunk, all entry points which use that library would be forced to load all code in that library used by any entry point, even if it's only ever used by one entry point. Manual chunking will prevent esbuild from inlining code from that library that's only used by a single entry point directly into that entry point itself.

This is why I think it's valuable to be able to (manually or otherwise) assign code to multiple chunks. If a library appears in multiple chunks, tree-shaking can be applied to each to remove the unused parts; this allows you to import a small function from a large library.


evanw avatar evanw commented on May 5, 2024

This is why I think it's valuable to be able to (manually or otherwise) assign code to multiple chunks. If a library appears in multiple chunks,

I'm not quite sure what you're saying. It might be the case that the upcoming version of esbuild's code splitting will already do what you're saying.

  1. You could be saying that identical copies of the same piece of code could be present in multiple chunks. This is incompatible with how esbuild's code splitting works. When code splitting is enabled, a given piece of code must only ever live in a single chunk. If code lived in multiple chunks and those chunks were loaded simultaneously, you'd get bugs because two copies of a module would be loaded at the same time which isn't supposed to happen.

    You can only get duplicate copies of code in separate output files when code splitting is off. But that's because then each output file is completely self-contained and never imports any other output files.

  2. You could be saying that different pieces of code from the same library (e.g. npm package) are present in different chunks, without duplication. That's how esbuild's code splitting already works. Different files from the same library can end up in different chunks. Actually (and this is different than other bundlers) different pieces of code in the same file can also end up in different chunks. With esbuild's automatic code splitting, file boundaries don't really matter for side-effect free code. Files are automatically split up into pieces and each piece can potentially live in a separate chunk. You can read more details here.

    This is important to point out because manually assigning a file as a single chunk the way I've envisioned it (when the manual chunk assignment feature is released) will actually prevent this optimization, since the assignment tells esbuild to put all code in that file in the same chunk. This will potentially improve caching if you anticipate future code using more code from that library, but otherwise doing this just creates dead code that is downloaded only to be unused. So manual chunk assignments are potentially a foot-gun.

Maybe you could point me to some resources that describe this more if you're talking about existing behavior from other bundlers?

tree-shaking can be applied to each to remove the unused parts; this allows you to import a small function from a large library.

This is always the case because esbuild's tree shaking is always active and can't be disabled. Side-effect free ESM code that is never used will always be dropped regardless of what library it's in. Basically the automatic code splitting settings should do this fine.

If you start manually assigning chunks you could end up causing some dead code in some cases. Specifically, if all entry points use different parts of the same library and you assign all code in that library to the same manual chunk, all entry points will have to pay the cost of downloading all code in that library that any entry point uses (tree shaking is still active so parts of the library that none of the entry points use will still be removed).

If you didn't use a manual chunk assignment, then esbuild would automatically compute optimal chunk boundaries for shared code resulting in no dead code. If all entry points use different non-overlapping parts of that library, you could even get no shared chunk at all because only the relevant code will have been inlined directly into the respective entry point chunks.
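The trade-off can be shown with a toy model. This is only a sketch under made-up assumptions: the usedBy table and the function names are hypothetical, and each string stands for one piece of a single library.

```javascript
// Hypothetical: which pieces of one library each entry point uses.
const usedBy = {
  a: ["formatDate"],
  b: ["parseDate"],
  c: ["formatDate", "addDays"],
};

// Automatic splitting: an entry point only loads the pieces it uses.
function loadedAutomatically(entry) {
  return new Set(usedBy[entry]);
}

// Manual chunk: every entry point loads everything that any entry point uses.
function loadedWithManualChunk() {
  return new Set(Object.values(usedBy).flat());
}

// Pieces an entry point downloads but never runs when a manual chunk is used.
function deadCodeFor(entry) {
  const used = loadedAutomatically(entry);
  return [...loadedWithManualChunk()].filter((piece) => !used.has(piece));
}

console.log(deadCodeFor("b")); // [ 'formatDate', 'addDays' ]
```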


DanielHeath avatar DanielHeath commented on May 5, 2024

I was suggesting 1, on the basis that most libraries are stateless, and if you customize your config to have a stateful library appear multiple times, you are inviting bugs.

However, since you've implemented splitting within a file, I think it's not required at all. The situation where two entrypoints load overlapping-but-largely-distinct subsets of a large library is pretty damn hard to hit.

It's not tractable to figure out an optimal split in cases like that automatically (barring perhaps via symbolic execution, which would be an unreasonable amount of complexity to carry for such a niche feature).


joeljeske avatar joeljeske commented on May 5, 2024

The other way is to pull out the hashes into an import map. That adds a level of indirection between the import paths and the actual hashed file names. It can lead to better caching because changing a dependency doesn't involve also changing the dependents, but import maps aren't a part of the web platform yet so this approach is presumably not viable for a while.

I am very interested in this approach and would like to see it as an option within esbuild. It could be argued that web linking is fundamentally flawed if content hashes of dependencies appear inside a chunk. If they do, then in most applications minor changes cause cascading file changes.

I currently target SystemJS at runtime and use import maps to link everything together. I am very interested in using esbuild but this would be a requirement in order to maintain long-term cacheability. Alternatively, esbuild could write out non-hashed filename imports in its chunks, and I could generate an import map using my own content hashes and rename the outputted files.

I am not aware of competition to the importmap spec so I am hoping it will pass through. Additionally, I suspect that any approach taken to support importmap name resolution would be easily swappable to another implementation/format due to the nature of this problem.
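A minimal sketch of that alternative, assuming the chunks import each other by unhashed names and a post-build step adds the hashes. The outputs table, function names, and 8-character hash length are all choices made for this example, not anything esbuild emits.

```javascript
const crypto = require("node:crypto");

// Hypothetical build output: unhashed file names mapped to their contents.
// Note that app.js refers to shared.js by its unhashed name.
const outputs = {
  "./app.js": 'import { h } from "./shared.js"; h();',
  "./shared.js": "export function h() {}",
};

// Short content hash used as the file fingerprint.
function contentHash(text) {
  return crypto.createHash("sha256").update(text).digest("hex").slice(0, 8);
}

// Map each unhashed name to the hashed file name it should be renamed to.
function buildImportMap(files) {
  const imports = {};
  for (const [name, contents] of Object.entries(files)) {
    imports[name] = name.replace(/\.js$/, `-${contentHash(contents)}.js`);
  }
  return { imports };
}

console.log(JSON.stringify(buildImportMap(outputs), null, 2));
```

Because app.js never contains the hash of shared.js, editing shared.js changes only the shared.js entry in the map, and the cached copy of app.js stays valid.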


jlfwong avatar jlfwong commented on May 5, 2024

RE: small chunks: they are more cache-optimal, but they also need more total network bandwidth. Compression works better on larger chunks because content repeated across chunks is de-duplicated (e.g. len(gzip(a + b)) < len(gzip(a)) + len(gzip(b))). See e.g. https://blog.khanacademy.org/forgo-js-packaging-not-so-fast/
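The inequality is easy to check with Node's built-in zlib module. The two strings below are invented stand-ins for chunks that share a lot of text; real chunks will show a smaller or larger gap depending on how much they overlap.

```javascript
const zlib = require("node:zlib");

// Two hypothetical chunks that contain much of the same text.
const a = "export function formatUserName(user) { return user.first + ' ' + user.last; }\n".repeat(20);
const b = "export function formatUserName(user) { return user.last + ', ' + user.first; }\n".repeat(20);

// Compress the chunks as one file and as two separate files.
const together = zlib.gzipSync(a + b).length;
const separate = zlib.gzipSync(a).length + zlib.gzipSync(b).length;

console.log({ together, separate });
console.log(together < separate); // true
```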


overlookmotel avatar overlookmotel commented on May 5, 2024

@jlfwong Thanks. That's interesting. Well I did say my understanding was basic!

So a vast number of chunks is a bad idea, but still there's likely to be a "best middle place", which may well be more chunks than are currently generated. That point will differ from site to site depending on, for example, how much of their traffic is repeat visitors, and therefore how much caching comes into play.

I wonder if the two settings I proposed above would be enough to allow people to tune it for their needs, without getting into manual splitting with its downsides and overhead for the developer? Or is it too simplistic?


DanielHeath avatar DanielHeath commented on May 5, 2024

"Create an unused entrypoint with import React from 'react'; window._react = React" is a method of manually adjusting the bundle splitting. Not a great one, admittedly, but I'm not clear there exists a good answer.


overlookmotel avatar overlookmotel commented on May 5, 2024

@DanielHeath Just to be clear, what I was suggesting is that ESBuild provide an additional option (e.g. splitOn: ['react']) which would internally create these dummy entry points, but not output them.

But yes, for anyone who wants some form of manual chunk splitting right now, this is a workaround to do it.


jpike88 avatar jpike88 commented on May 5, 2024

My current bundler splits app-specific files into an app.js and all node_modules/external files into a vendor.js

Does anyone else do this and is there a real benefit in it? Feels weird to have esbuild squishing everything into one big JS file.


retorquere avatar retorquere commented on May 5, 2024

I would if I could. If anything it isolates load failures to a smaller file to do diagnosis on. For the same reasons I'd prefer to be able to split on vendor.js, common.js, and page-specific bundles. Leaves a greater chance that parts will load.


matthiasg avatar matthiasg commented on May 5, 2024

@arcanis Isn't it possible to write an IIFE entry file around the ESM modules? Or do you want to target non-green browsers, or work around the Firefox issue mentioned above?


ephemer avatar ephemer commented on May 5, 2024

I would like to quickly chime in on this discussion again because it's been a few months since I last wrote. In July I put a bunch of work into this and was able to get code splitting working fine for esm. Unfortunately I don't remember right now exactly what needed changing for this to work in our setup – the main thing was probably changing the input files so they really did use es module syntax by cleanly importing and exporting the needed parts. We have a large legacy codebase written in Meteor and I put a bunch of work into removing the Meteor magic and using real imports.

With that in mind, I would now request instead that the default for target: "browser" change to bundle esm, given that more and more browsers support it. I don't consider it that urgent or important though, and it would be a breaking change. Maybe something to consider for later though.

As for Firefox not supporting esm @jbms, in our setup we have a loader plugin that creates a separate bundle for our worker by calling the esbuild API again from the plugin, just to bundle the worker file. There we use a different set of settings (in our case we do not need splitting at all for our worker bundle). Maybe that workaround is possible for you too.


jbms avatar jbms commented on May 5, 2024

Thanks for the suggestion. When bundling and not using code splitting, is there any significant difference between esm and iife format? I do want splitting for worker bundles, so unfortunately your suggestion does not help me there.


retorquere avatar retorquere commented on May 5, 2024

I'm in a similar bind -- I'm writing a Zotero plugin, and esm modules are not supported there.


pft avatar pft commented on May 5, 2024

Code splitting with dynamic import() of a JSON file that has key names at the root that would not be valid JavaScript identifiers yields incorrect named exports:

File a.json:

{ "x-y": "foo" }

File imp.js:

const getJSON = () => import("./a.json");
getJSON();

Build an esm bundle with splitting:

[user@dom0 ~]$ esbuild --splitting --bundle --format=esm --outdir=app imp.js

Output:

File app/a-ZAVKVQOM.js:

// a.json
var x_y = "foo";
var a_default = { "x-y": x_y };
export {
  a_default as default,
  x_y as "x-y"
};

File app/imp.js:

// imp.js
var getJSON = () => import("./a-ZAVKVQOM.js");
getJSON();

I propose to simply not try and export those fields separately.

By the way, the JSON module proposal does not do named exports at all for JSON files, precisely because of this reason (and because it's conceptually a single thing); this reasoning is at the bottom of that page.
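A sketch of the proposed behavior: only keys that are plain identifiers get their own named export, and everything stays reachable through the default export. The regular expression is a simplified ASCII-only check (real JavaScript identifiers also allow Unicode letters), and namedExportKeys is a hypothetical helper, not esbuild code.

```javascript
// Simplified, ASCII-only identifier check (an assumption for this sketch).
const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Pick the top-level JSON keys that could safely become named exports.
// "default" is skipped because that name is taken by the default export.
function namedExportKeys(json) {
  return Object.keys(json).filter(
    (key) => key !== "default" && IDENTIFIER.test(key)
  );
}

console.log(namedExportKeys({ "x-y": "foo" })); // []
console.log(namedExportKeys({ "x-y": "foo", version: "1.0.0" })); // [ 'version' ]
```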


pft avatar pft commented on May 5, 2024

Thanks for clarifying @evanw, about this new spec and how to deal with it if the intended environment does not support it.

One question though: In dynamic imports, there is no syntax to import stuff like that, or am I missing something?


mattfysh avatar mattfysh commented on May 5, 2024

I switched to code splitting on dynamic imports but now I've come across a strange bug. Has anyone seen this before?

const { parse } = await import('css-what')
parse(selector)

css-what has both ESM and CJS code internally as well as both main and module package.json entries pointing to the relevant file, but for some reason esbuild is using the CJS code instead of ESM. When I edit the package.json and point main to the same place as module, the code starts working again.

I don't think this is a bug with css-what, but I could be wrong


mattfysh avatar mattfysh commented on May 5, 2024

Thanks @hyrious, the fix for my case was to use the --main-fields flag! One other thing I've noticed is that my output directory is much larger. I'm guessing that no tree shaking occurs when using code splitting and dynamic imports?
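For readers hitting the same problem, the idea behind a main-fields list can be modeled as a first-match lookup over package.json fields. This is a simplified model and not esbuild's actual resolver; the package.json contents and the pickEntry helper are invented for the example. On the command line the order is set with a flag such as --main-fields=module,main.

```javascript
// Simplified model: return the first listed field that the package defines.
function pickEntry(pkg, mainFields) {
  for (const field of mainFields) {
    if (typeof pkg[field] === "string") return pkg[field];
  }
  return "index.js"; // conventional fallback when no field matches
}

// Hypothetical package.json with both a CommonJS and an ESM build.
const pkg = { main: "lib/commonjs/index.js", module: "lib/es/index.js" };

console.log(pickEntry(pkg, ["main", "module"])); // lib/commonjs/index.js
console.log(pickEntry(pkg, ["module", "main"])); // lib/es/index.js
```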


eamodio avatar eamodio commented on May 5, 2024

Is there any way to control the code-splitting to only look at async/dynamic imports?


evanw avatar evanw commented on May 5, 2024

If you bundle each entry point separately, then entry points won’t share any code.


eamodio avatar eamodio commented on May 5, 2024

If you bundle each entry point separately, then entry points won’t share any code.

Not sure I fully understand that, but I tried setting up separate entry points for a couple of my dynamic import() calls and the main entry point still fully bundled everything, and then it created separate bundles for those imports (but they weren't used).

And if I try to use splitting then I still get WAY too many split files


JounQin avatar JounQin commented on May 5, 2024

cjs can also use import(path) inside, so I'm wondering why splitting can only be enabled with the esm format. Even with the cjs output format, the dynamic chunks could still be esm. Of course, the correct extensions (.cjs vs .mjs) should be applied in this case.

@evanw


mxdvl avatar mxdvl commented on May 5, 2024

If you bundle each entry point separately, then entry points won’t share any code.

And if I try to use splitting then I still get WAY too many split files

One important caveat is that if you're importing entry points dynamically, you need to make sure that they are marked as external.

For example, if → indicates a dynamic import, and your entry points are A, D & C:

A → B → C
D → C
C

You would need to make sure that your onResolve callback marks dynamic imports of A, D & C as external.
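One way such a callback could look is sketched below as an esbuild plugin. The plugin name and the entryPaths argument are made up for this example, and it assumes the import() specifiers in your source are written exactly like the strings in entryPaths, since args.path is the specifier as written in the importing file. You would also need the external path to resolve to the separately built file at runtime.

```javascript
// Sketch of an esbuild plugin that leaves dynamic imports of entry points
// unbundled. `entryPaths` is a hypothetical list of import specifiers.
function externalDynamicEntries(entryPaths) {
  const entries = new Set(entryPaths);
  return {
    name: "external-dynamic-entries",
    setup(build) {
      build.onResolve({ filter: /.*/ }, (args) => {
        // Only touch import() calls that point at one of the entry points.
        if (args.kind === "dynamic-import" && entries.has(args.path)) {
          return { path: args.path, external: true };
        }
        return undefined; // let esbuild resolve everything else as usual
      });
    },
  };
}

// Usage sketch: pass it in the `plugins` array of a build, e.g.
// plugins: [externalDynamicEntries(["./a.js", "./c.js", "./d.js"])]
```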


Murtatrxx avatar Murtatrxx commented on May 5, 2024

Is this still WIP?


wiebekaai avatar wiebekaai commented on May 5, 2024

If you bundle each entry point separately, then entry points won’t share any code.

Not sure I fully understand that, but I tried setting up separate entry points for a couple of my dynamic import() calls and the main entry point still fully bundled everything, and then it created separate bundles for those imports (but they weren't used).

And if I try to use splitting then I still get WAY too many split files

I have the same issue. It splits static imports too, but I would only like to split on dynamic import(). Any luck?
