
Mesh optimization library that makes meshes smaller and faster to render

License: MIT License


🐇 meshoptimizer

Purpose

When a GPU renders triangle meshes, various stages of the GPU pipeline have to process vertex and index data. The efficiency of these stages depends on the data you feed to them; this library provides algorithms to help optimize meshes for these stages, as well as algorithms to reduce the mesh complexity and storage overhead.

The library provides a C and C++ interface for all algorithms; you can use it from C/C++ or from other languages via FFI (such as P/Invoke). If you want to use this library from Rust, you should use the meshopt crate. A JavaScript interface for some algorithms is available through meshoptimizer.js.

gltfpack, a tool that can automatically optimize glTF files, is developed and distributed alongside the library.

Installing

meshoptimizer is hosted on GitHub; you can download the latest release using git:

git clone -b v0.21 https://github.com/zeux/meshoptimizer.git

Alternatively you can download the .zip archive from GitHub.

The library is also available as a Linux package in several distributions (ArchLinux, Debian, FreeBSD, Nix, Ubuntu), as well as a Vcpkg port (see installation instructions) and a Conan package.

gltfpack is available as a pre-built binary on the Releases page or via an npm package. Native binaries are recommended since they are more efficient and support texture compression.

Building

meshoptimizer is distributed as a set of C++ source files. To include it into your project, you can use one of two options:

  • Use CMake to build the library (either as a standalone project or as part of your project)
  • Add source files to your project's build system

The source files are organized in such a way that you don't need to change your build-system settings, and you only need to add the source files for the algorithms you use. They should build without warnings or special compilation options on all major compilers.

Pipeline

When optimizing a mesh, you should typically feed it through a set of optimizations (the order is important!):

  1. Indexing
  2. (optional, discussed last) Simplification
  3. Vertex cache optimization
  4. Overdraw optimization
  5. Vertex fetch optimization
  6. Vertex quantization
  7. (optional) Vertex/index buffer compression

Indexing

Most algorithms in this library assume that a mesh has a vertex buffer and an index buffer. For the algorithms to work well, and for the GPU to render your mesh efficiently, the vertex buffer must have no redundant vertices; you can generate an index buffer from an unindexed vertex buffer, or reindex an existing (potentially redundant) index buffer, as follows:

First, generate a remap table from your existing vertex (and, optionally, index) data:

size_t index_count = face_count * 3;
size_t unindexed_vertex_count = face_count * 3;
std::vector<unsigned int> remap(index_count); // allocate temporary memory for the remap table
size_t vertex_count = meshopt_generateVertexRemap(&remap[0], NULL, index_count, &unindexed_vertices[0], unindexed_vertex_count, sizeof(Vertex));

Note that in this case we only have an unindexed vertex buffer; when the input mesh has an index buffer, it will need to be passed to meshopt_generateVertexRemap instead of NULL, along with the correct source vertex count. In either case, the remap table is generated based on binary equivalence of the input vertices, so the resulting mesh will render the same way. Binary equivalence considers all input bytes, including padding, which should be zero-initialized if the vertex structure has gaps.
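For illustration, the indexed variant of the call might look like this (indices, mesh_vertices and mesh_vertex_count are assumed names describing the existing indexed mesh, not part of the snippets above):

```cpp
// variant for a mesh that already has a (possibly redundant) index buffer;
// 'indices', 'mesh_vertices' and 'mesh_vertex_count' describe the existing mesh
std::vector<unsigned int> remap(index_count);
size_t vertex_count = meshopt_generateVertexRemap(&remap[0], indices, index_count,
    &mesh_vertices[0], mesh_vertex_count, sizeof(Vertex));
```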

After generating the remap table, you can allocate space for the target vertex buffer (vertex_count elements) and index buffer (index_count elements) and generate them:

meshopt_remapIndexBuffer(indices, NULL, index_count, &remap[0]);
meshopt_remapVertexBuffer(vertices, &unindexed_vertices[0], unindexed_vertex_count, sizeof(Vertex), &remap[0]);

You can then further optimize the resulting buffers by calling the other functions on them in-place.

Vertex cache optimization

When the GPU renders the mesh, it has to run the vertex shader for each vertex; usually GPUs have a built-in fixed size cache that stores the transformed vertices (the result of running the vertex shader), and uses this cache to reduce the number of vertex shader invocations. This cache is usually small, 16-32 vertices, and can have different replacement policies; to use this cache efficiently, you have to reorder your triangles to maximize the locality of reused vertex references like so:

meshopt_optimizeVertexCache(indices, indices, index_count, vertex_count);

Overdraw optimization

After transforming the vertices, the GPU sends the triangles for rasterization, which generates pixels that are usually first run through the depth test; pixels that pass it get the pixel shader executed to generate the final color. As pixel shaders get more expensive, it becomes more and more important to reduce overdraw. While improving overdraw generally requires view-dependent operations, this library provides an algorithm to reorder triangles to minimize overdraw from all directions, which you should run after vertex cache optimization like this:

meshopt_optimizeOverdraw(indices, indices, index_count, &vertices[0].x, vertex_count, sizeof(Vertex), 1.05f);

The overdraw optimizer needs to read vertex positions as a float3 from the vertex; the code snippet above assumes that the vertex stores position as float x, y, z.

When performing the overdraw optimization you have to specify a floating-point threshold parameter. The algorithm tries to maintain a balance between vertex cache efficiency and overdraw; the threshold determines how much the algorithm can compromise the vertex cache hit ratio, with 1.05 meaning that the resulting ratio should be at most 5% worse than before the optimization.

Vertex fetch optimization

After the final triangle order has been established, we can still optimize the vertex buffer for memory efficiency. Before running the vertex shader, the GPU has to fetch the vertex attributes from the vertex buffer; the fetch is usually backed by a memory cache, so optimizing the data for locality of memory access is important. You can do this by running this code:

meshopt_optimizeVertexFetch(vertices, indices, index_count, vertices, vertex_count, sizeof(Vertex));

This will reorder the vertices in the vertex buffer to try to improve the locality of reference, and rewrite the indices in place to match; if the vertex data is stored using multiple streams, you should use meshopt_optimizeVertexFetchRemap instead. This optimization has to be performed on the final index buffer since the optimal vertex order depends on the triangle order.

Note that the algorithm does not try to model cache replacement precisely and instead just orders vertices in the order of use, which generally produces results that are close to optimal.

Vertex quantization

To optimize memory bandwidth when fetching the vertex data even further, and to reduce the amount of memory required to store the mesh, it is often beneficial to quantize the vertex attributes to smaller types. While this optimization can technically run at any point in the pipeline (and sometimes doing quantization as the first step can improve indexing by merging almost identical vertices), it is generally easier to run it after all other optimizations, since some of them require access to float3 positions.

Quantization is usually domain specific; it's common to quantize normals using three 8-bit integers, but you can use higher-precision quantization (for example, 10 bits per component in a 10_10_10_2 format) or a different encoding that uses just 2 components. For position and texture coordinate data, the two most common storage formats are half-precision floats and 16-bit normalized integers that encode the position relative to the AABB of the mesh or the UV bounding rectangle.

The number of possible combinations here is very large but this library does provide the building blocks, specifically functions to quantize floating point values to normalized integers, as well as half-precision floats. For example, here's how you can quantize a normal:

unsigned int normal =
    (meshopt_quantizeUnorm(v.nx, 10) << 20) |
    (meshopt_quantizeUnorm(v.ny, 10) << 10) |
     meshopt_quantizeUnorm(v.nz, 10);

and here's how you can quantize a position:

unsigned short px = meshopt_quantizeHalf(v.x);
unsigned short py = meshopt_quantizeHalf(v.y);
unsigned short pz = meshopt_quantizeHalf(v.z);

Since quantized vertex attributes often need to remain in their compact representations for efficient transfer and storage, they are usually dequantized during vertex processing by configuring the GPU vertex input correctly to expect normalized integers or half precision floats, which often needs no or minimal changes to the shader code. When CPU dequantization is required instead, meshopt_dequantizeHalf can be used to convert half precision values back to single precision; for normalized integer formats, the dequantization just requires dividing by 2^N-1 for unorm and 2^(N-1)-1 for snorm variants, for example manually reversing meshopt_quantizeUnorm(v, 10) can be done by dividing by 1023.

Vertex/index buffer compression

In case storage size or transmission bandwidth is important, you might want to additionally compress vertex and index data. While several mesh compression libraries, like Google Draco, are available, they are typically designed to maximize the compression ratio at the cost of disturbing the vertex/index order (which makes the meshes inefficient to render on the GPU) or of decompression performance. They also frequently don't support custom game-ready quantized vertex formats and thus require re-quantizing the data after loading it, introducing extra quantization errors and making decoding slower.

Alternatively you can use general purpose compression libraries like zstd or Oodle to compress vertex/index data - however these compressors aren't designed to exploit redundancies in vertex/index data and as such compression rates can be unsatisfactory.

To that end, this library provides algorithms to "encode" vertex and index data. The result of the encoding is generally significantly smaller than initial data, and remains compressible with general purpose compressors - so you can either store encoded data directly (for modest compression ratios and maximum decoding performance), or further compress it with zstd/Oodle to maximize compression ratio.

Note: this compression scheme is available as a glTF extension EXT_meshopt_compression.

To encode, you need to allocate target buffers (preferably using the worst case bound) and call encoding functions:

std::vector<unsigned char> vbuf(meshopt_encodeVertexBufferBound(vertex_count, sizeof(Vertex)));
vbuf.resize(meshopt_encodeVertexBuffer(&vbuf[0], vbuf.size(), vertices, vertex_count, sizeof(Vertex)));

std::vector<unsigned char> ibuf(meshopt_encodeIndexBufferBound(index_count, vertex_count));
ibuf.resize(meshopt_encodeIndexBuffer(&ibuf[0], ibuf.size(), indices, index_count));

You can then either serialize vbuf/ibuf as is, or compress them further. To decode the data at runtime, call decoding functions:

int resvb = meshopt_decodeVertexBuffer(vertices, vertex_count, sizeof(Vertex), &vbuf[0], vbuf.size());
int resib = meshopt_decodeIndexBuffer(indices, index_count, &ibuf[0], ibuf.size());
assert(resvb == 0 && resib == 0);

Note that vertex encoding assumes that the vertex buffer was optimized for vertex fetch and that vertices are quantized; index encoding assumes that the vertex/index buffers were optimized for vertex cache and vertex fetch. Feeding unoptimized data into the encoders will produce poor compression ratios. Both codecs are lossless - the only lossy step is quantization, which happens before encoding.

To reduce the data size further, it's recommended to use meshopt_optimizeVertexCacheStrip instead of meshopt_optimizeVertexCache when optimizing for vertex cache, and to use the new index codec version (meshopt_encodeIndexVersion(1)). This trades off some vertex transform efficiency for smaller vertex and index data.

Decoding functions are heavily optimized and can directly target write-combined memory; you can expect both decoders to run at 1-3 GB/s on modern desktop CPUs. Compression ratios depend on the data; vertex data compression ratio is typically around 2-4x (compared to already quantized data), index data compression ratio is around 5-6x (compared to raw 16-bit index data). General purpose lossless compressors can further improve on these results.

The index buffer codec only supports triangle list topology; when encoding triangle strips or line lists, use meshopt_encodeIndexSequence/meshopt_decodeIndexSequence instead. This codec typically encodes indices into ~1 byte per index, but compressing the results further with a general purpose compressor can improve this to 1-3 bits per index.

The following guarantees on data compatibility are provided for point releases (no guarantees are given for development branch):

  • Data encoded with older versions of the library can always be decoded with newer versions;
  • Data encoded with newer versions of the library can be decoded with older versions, provided that encoding versions are set correctly; if binary stability of encoded data is important, use meshopt_encodeVertexVersion and meshopt_encodeIndexVersion to 'pin' the data versions.
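For example, a sketch of pinning the codec versions before encoding (0 is the baseline vertex codec version; 1 is the newer index codec version mentioned earlier):

```cpp
// pin codec versions globally so that the encoded data stays decodable
// by decoders that only understand these versions
meshopt_encodeVertexVersion(0);
meshopt_encodeIndexVersion(1);
```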

Due to its very high decoding performance and compatibility with general purpose lossless compressors, this compression is a good fit for use on the web. To that end, meshoptimizer provides both vertex and index decoders compiled into WebAssembly and wrapped into a module with a JavaScript-friendly interface, js/meshopt_decoder.js, that you can use to decode meshes that were encoded offline:

// ready is a Promise that is resolved when (asynchronous) WebAssembly compilation finishes
await MeshoptDecoder.ready;

// decode from *Data (Uint8Array) into *Buffer (Uint8Array)
MeshoptDecoder.decodeVertexBuffer(vertexBuffer, vertexCount, vertexSize, vertexData);
MeshoptDecoder.decodeIndexBuffer(indexBuffer, indexCount, indexSize, indexData);

A usage example is available, with source in demo/index.html; this example uses .glb files encoded using gltfpack.

Point cloud compression

The vertex encoding algorithms can be used to compress arbitrary streams of attribute data; one other use case besides triangle meshes is point cloud data. Typically point clouds come with position, color and possibly other attributes but don't have an implied point order.

To compress point clouds efficiently, it's recommended to first preprocess the points by sorting them using the spatial sort algorithm:

std::vector<unsigned int> remap(point_count);
meshopt_spatialSortRemap(&remap[0], positions, point_count, sizeof(vec3));

// for each attribute stream
meshopt_remapVertexBuffer(positions, positions, point_count, sizeof(vec3), &remap[0]);

After this, the resulting arrays should be quantized (e.g. using 16-bit fixed point numbers for positions and 8-bit color components), and the result can be compressed using meshopt_encodeVertexBuffer as described in the previous section. To decompress, meshopt_decodeVertexBuffer will recover the quantized data, which can be used directly or converted back to the original floating-point data. The compression ratio depends on the nature of the source data; for colored points it's typical to get 35-40 bits per point.
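For illustration, quantizing point positions to 16-bit values relative to the cloud's bounding box might look like this (min and extent are assumed names describing the precomputed AABB; meshopt_quantizeUnorm is the library helper used earlier):

```cpp
// quantize each position component into a 16-bit fixed point value,
// normalized to the point cloud's AABB computed beforehand
unsigned short qx = (unsigned short)meshopt_quantizeUnorm((p.x - min.x) / extent, 16);
unsigned short qy = (unsigned short)meshopt_quantizeUnorm((p.y - min.y) / extent, 16);
unsigned short qz = (unsigned short)meshopt_quantizeUnorm((p.z - min.z) / extent, 16);
```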

Triangle strip conversion

On most hardware, indexed triangle lists are the most efficient way to drive the GPU. However, in some cases triangle strips might prove beneficial:

  • On some older GPUs, triangle strips may be a bit more efficient to render
  • On extremely memory constrained systems, index buffers for triangle strips could save a bit of memory

This library provides an algorithm for converting a vertex cache optimized triangle list to a triangle strip:

std::vector<unsigned int> strip(meshopt_stripifyBound(index_count));
unsigned int restart_index = ~0u;
size_t strip_size = meshopt_stripify(&strip[0], indices, index_count, vertex_count, restart_index);

Typically you should expect triangle strips to have ~50-60% of indices compared to triangle lists (~1.5-1.8 indices per triangle) and have ~5% worse ACMR. Note that triangle strips can be stitched with or without restart index support. Using restart indices can result in ~10% smaller index buffers, but on some GPUs restart indices may result in decreased performance.

To reduce the triangle strip size further, it's recommended to use meshopt_optimizeVertexCacheStrip instead of meshopt_optimizeVertexCache when optimizing for vertex cache. This trades off some efficiency in vertex transform for smaller index buffers.
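If a triangle list is needed again at load time, the strip produced above can be converted back using meshopt_unstripify; this sketch reuses the strip, strip_size and restart_index names from the snippet above:

```cpp
// convert the strip back to a triangle list; the bound covers the worst case
std::vector<unsigned int> list(meshopt_unstripifyBound(strip_size));
list.resize(meshopt_unstripify(&list[0], &strip[0], strip_size, restart_index));
```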

Deinterleaved geometry

All of the examples above assume that geometry is represented as a single vertex buffer and a single index buffer. This requires storing all vertex attributes - position, normal, texture coordinate, skinning weights etc. - in a single contiguous struct. However, in some cases using multiple vertex streams may be preferable. In particular, if some passes require only positional data - such as depth pre-pass or shadow map - then it may be beneficial to split it from the rest of the vertex attributes to make sure the bandwidth use during these passes is optimal. On some mobile GPUs a position-only attribute stream also improves efficiency of tiling algorithms.

Most of the functions in this library either only need the index buffer (such as vertex cache optimization) or only need positional information (such as overdraw optimization). However, several tasks require knowledge about all vertex attributes.

For indexing, meshopt_generateVertexRemap assumes that there's just one vertex stream; when multiple vertex streams are used, it's necessary to use meshopt_generateVertexRemapMulti as follows:

meshopt_Stream streams[] = {
    {&unindexed_pos[0], sizeof(float) * 3, sizeof(float) * 3},
    {&unindexed_nrm[0], sizeof(float) * 3, sizeof(float) * 3},
    {&unindexed_uv[0], sizeof(float) * 2, sizeof(float) * 2},
};

std::vector<unsigned int> remap(index_count);
size_t vertex_count = meshopt_generateVertexRemapMulti(&remap[0], NULL, index_count, index_count, streams, sizeof(streams) / sizeof(streams[0]));

After this meshopt_remapVertexBuffer needs to be called once for each vertex stream to produce the correctly reindexed stream.
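Continuing the sketch above, that might look like this (pos, nrm and uv are assumed destination buffers with vertex_count elements each; note that the source vertex count for the unindexed streams is index_count):

```cpp
meshopt_remapIndexBuffer(indices, NULL, index_count, &remap[0]);

meshopt_remapVertexBuffer(pos, &unindexed_pos[0], index_count, sizeof(float) * 3, &remap[0]);
meshopt_remapVertexBuffer(nrm, &unindexed_nrm[0], index_count, sizeof(float) * 3, &remap[0]);
meshopt_remapVertexBuffer(uv, &unindexed_uv[0], index_count, sizeof(float) * 2, &remap[0]);
```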

Instead of calling meshopt_optimizeVertexFetch for reordering vertices in a single vertex buffer for efficiency, calling meshopt_optimizeVertexFetchRemap and then calling meshopt_remapVertexBuffer for each stream again is recommended.
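A sketch of that sequence, reusing the per-stream buffer names (pos, nrm, uv) as assumptions:

```cpp
// compute a fetch-optimized vertex order as a remap table, then apply it
// to the index buffer and to every vertex stream in place
std::vector<unsigned int> remap(vertex_count);
meshopt_optimizeVertexFetchRemap(&remap[0], indices, index_count, vertex_count);

meshopt_remapIndexBuffer(indices, indices, index_count, &remap[0]);
meshopt_remapVertexBuffer(pos, pos, vertex_count, sizeof(float) * 3, &remap[0]);
meshopt_remapVertexBuffer(nrm, nrm, vertex_count, sizeof(float) * 3, &remap[0]);
meshopt_remapVertexBuffer(uv, uv, vertex_count, sizeof(float) * 2, &remap[0]);
```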

Finally, when compressing vertex data, meshopt_encodeVertexBuffer should be used on each vertex stream separately - this allows the encoder to best utilize correlation between attribute values for different vertices.

Simplification

All algorithms presented so far don't affect visual appearance at all, with the exception of quantization, which has a minimal, controlled impact. However, fundamentally the most effective way to reduce the rendering or transmission cost of a mesh is to make the mesh simpler.

This library provides two simplification algorithms that reduce the number of triangles in the mesh. Given a vertex and an index buffer, they generate a second index buffer that uses existing vertices in the vertex buffer. This index buffer can be used directly for rendering with the original vertex buffer (preferably after vertex cache optimization), or a new compact vertex/index buffer can be generated using meshopt_optimizeVertexFetch that uses the optimal number and order of vertices.

The first simplification algorithm, meshopt_simplify, follows the topology of the original mesh in an attempt to preserve attribute seams, borders and overall appearance. For meshes with inconsistent topology or many seams, such as faceted meshes, this can result in the simplifier getting "stuck" and being unable to simplify the mesh fully. It is therefore critical that identical vertices are "welded" together, that is, that the input vertex buffer does not contain duplicates. Additionally, it may be possible to preprocess the index buffer (e.g. with meshopt_generateShadowIndexBuffer) to weld vertices without taking into account vertex attributes that aren't critical and can be rebuilt later.

float threshold = 0.2f;
size_t target_index_count = size_t(index_count * threshold);
float target_error = 1e-2f;

std::vector<unsigned int> lod(index_count);
float lod_error = 0.f;
lod.resize(meshopt_simplify(&lod[0], indices, index_count, &vertices[0].x, vertex_count, sizeof(Vertex),
    target_index_count, target_error, /* options= */ 0, &lod_error));

The target error is an approximate measure of the deviation from the original mesh, using distance normalized to the [0..1] range (e.g. 1e-2f means that the simplifier will try to keep the error below 1% of the mesh extents). Note that the simplifier attempts to produce the requested number of indices at minimal error, but because of topological restrictions and the error limit it is not guaranteed to reach the target index count and can stop earlier.

The second simplification algorithm, meshopt_simplifySloppy, doesn't follow the topology of the original mesh. This means that it doesn't preserve attribute seams or borders, but it can collapse internal details that are too small to matter better because it can merge mesh features that are topologically disjoint but spatially close.

float threshold = 0.2f;
size_t target_index_count = size_t(index_count * threshold);
float target_error = 1e-1f;

std::vector<unsigned int> lod(index_count);
float lod_error = 0.f;
lod.resize(meshopt_simplifySloppy(&lod[0], indices, index_count, &vertices[0].x, vertex_count, sizeof(Vertex),
    target_index_count, target_error, &lod_error));

This algorithm will not stop early due to topology restrictions, but it can still do so if the target index count can't be reached without introducing an error larger than the target. It is 5-6x faster than meshopt_simplify when the simplification ratio is large, and is able to reach ~20M triangles/sec on a desktop CPU (meshopt_simplify works at ~3M triangles/sec).

When generating a sequence of LOD meshes that all use the original vertex buffer, care must be taken to order vertices optimally so as not to penalize mobile GPU architectures that can only transform a sequential vertex buffer range. In this case it's recommended to first optimize each LOD for vertex cache, then assemble all LODs into one large index buffer starting from the coarsest LOD (the one with the fewest triangles), and call meshopt_optimizeVertexFetch on the final large index buffer. This will ensure that coarser LODs require a smaller vertex range and are efficient with respect to vertex fetch and transform.

Both algorithms can also return the resulting normalized deviation that can be used to choose the correct level of detail based on screen size or solid angle; the error can be converted to world space by multiplying by the scaling factor returned by meshopt_simplifyScale.
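For example (a sketch reusing lod_error and the vertex buffer names from the snippets above):

```cpp
// convert the normalized error returned by the simplifier into mesh units
float scale = meshopt_simplifyScale(&vertices[0].x, vertex_count, sizeof(Vertex));
float world_error = lod_error * scale;
```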

Advanced simplification

The main simplification algorithm, meshopt_simplify, exposes additional options and functions that can be used to control the simplification process in more detail.

For basic customization, a number of options can be passed via options bitmask that adjust the behavior of the simplifier:

  • meshopt_SimplifyLockBorder restricts the simplifier from collapsing edges that are on the border of the mesh. This can be useful for simplifying mesh subsets independently, so that the LODs can be combined without introducing cracks.
  • meshopt_SimplifyErrorAbsolute changes the error metric from relative to absolute both for the input error limit as well as for the resulting error. This can be used instead of meshopt_simplifyScale.
  • meshopt_SimplifySparse improves simplification performance assuming input indices are a sparse subset of the mesh. This can be useful when simplifying small mesh subsets independently, and is intended to be used for meshlet simplification. For consistency, it is recommended to use absolute errors when sparse simplification is desired, as this flag changes the meaning of the relative errors.

While meshopt_simplify is aware of attribute discontinuities by default (and infers them through the supplied index buffer) and tries to preserve them, it can be useful to provide information about attribute values. This allows the simplifier to take attribute error into account which can improve shading (by using vertex normals), texture deformation (by using texture coordinates), and may be necessary to preserve vertex colors when textures are not used in the first place. This can be done by using a variant of the simplification function that takes attribute values and weight factors, meshopt_simplifyWithAttributes:

const float nrm_weight = 0.5f;
const float attr_weights[3] = {nrm_weight, nrm_weight, nrm_weight};

std::vector<unsigned int> lod(index_count);
float lod_error = 0.f;
lod.resize(meshopt_simplifyWithAttributes(&lod[0], indices, index_count, &vertices[0].x, vertex_count, sizeof(Vertex),
    &vertices[0].nx, sizeof(Vertex), attr_weights, 3, /* vertex_lock= */ NULL,
    target_index_count, target_error, /* options= */ 0, &lod_error));

The attributes are passed as a separate buffer (in the example above it's a subset of the same vertex buffer) and should be stored as consecutive floats; attribute weights are used to control the importance of each attribute in the simplification process.

When using meshopt_simplifyWithAttributes, it is also possible to lock certain vertices by providing a vertex_lock array that contains a boolean value for each vertex in the mesh. This can be useful to preserve certain vertices, such as the boundary of the mesh, with more control than meshopt_SimplifyLockBorder option provides.

Simplification currently assumes that the input mesh is using the same material for all triangles. If the mesh uses multiple materials, it is possible to split the mesh into subsets based on the material and simplify each subset independently, using meshopt_SimplifyLockBorder or vertex_lock to preserve material boundaries; however, this limits the collapses and as a result may reduce the resulting quality. An alternative approach is to encode information about the material into the vertex buffer, ensuring that all three vertices referencing the same triangle have the same material ID; this may require duplicating vertices on the boundary between materials. After this, simplification can be performed as usual, and after simplification per-triangle material information can be computed from the vertex material IDs. There is no need to inform the simplifier of the value of the material ID: the implicit boundaries created by duplicating vertices with conflicting material IDs will be preserved automatically.
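As a sketch of the last step (vertex_material and triangle_material are hypothetical arrays for this example, not part of the library):

```cpp
// recover per-triangle material IDs after simplification; by construction
// all three vertices of every triangle carry the same material ID
for (size_t i = 0; i < lod.size(); i += 3)
    triangle_material[i / 3] = vertex_material[lod[i]];
```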

Mesh shading

Modern GPUs are beginning to deviate from the traditional rasterization model. NVidia GPUs starting from Turing and AMD GPUs starting from RDNA2 provide a new programmable geometry pipeline that, instead of being built around index buffers and vertex shaders, is built around mesh shaders - a new shader type that allows providing a batch of work to the rasterizer.

Using mesh shaders in the context of traditional mesh rendering provides an opportunity to use a variety of optimization techniques, from more efficient vertex reuse through various forms of culling (e.g. cluster frustum or occlusion culling) to in-memory compression, to maximize the utilization of GPU hardware. Beyond traditional rendering, mesh shaders provide a richer programming model that can synthesize new geometry more efficiently than common alternatives such as geometry shaders. Mesh shading can be accessed via the Vulkan or Direct3D 12 APIs; please refer to Introduction to Turing Mesh Shaders and Mesh Shaders and Amplification Shaders: Reinventing the Geometry Pipeline for additional information.

To use mesh shaders for conventional rendering efficiently, geometry needs to be converted into a series of meshlets; each meshlet represents a small subset of the original mesh and comes with a small set of vertices and a separate micro-index buffer that references vertices in the meshlet. This information can be directly fed to the rasterizer from the mesh shader. This library provides algorithms to create meshlet data for a mesh, and - assuming geometry is static - can compute bounding information that can be used to perform cluster culling, a technique that can reject a meshlet if it's invisible on screen.

To generate meshlet data, this library provides two algorithms - meshopt_buildMeshletsScan, which creates the meshlet data using a vertex cache-optimized index buffer as a starting point by greedily aggregating consecutive triangles until they go over the meshlet limits, and meshopt_buildMeshlets, which doesn't depend on any other algorithms and tries to balance topological efficiency (by maximizing vertex reuse inside meshlets) with culling efficiency (by minimizing meshlet radius and triangle direction divergence). meshopt_buildMeshlets is recommended in cases when the resulting meshlet data will be used in cluster culling algorithms.

const size_t max_vertices = 64;
const size_t max_triangles = 124;
const float cone_weight = 0.0f;

size_t max_meshlets = meshopt_buildMeshletsBound(indices.size(), max_vertices, max_triangles);
std::vector<meshopt_Meshlet> meshlets(max_meshlets);
std::vector<unsigned int> meshlet_vertices(max_meshlets * max_vertices);
std::vector<unsigned char> meshlet_triangles(max_meshlets * max_triangles * 3);

size_t meshlet_count = meshopt_buildMeshlets(meshlets.data(), meshlet_vertices.data(), meshlet_triangles.data(), indices.data(),
    indices.size(), &vertices[0].x, vertices.size(), sizeof(Vertex), max_vertices, max_triangles, cone_weight);

To generate the meshlet data, max_vertices and max_triangles need to be set within the limits supported by the hardware; for NVidia, values of 64 and 124 are recommended (max_triangles must be divisible by 4, so 124 is the closest value to NVidia's officially recommended 126). cone_weight should be left as 0 if cluster cone culling is not used, and set to a value between 0 and 1 to balance cone culling efficiency with other forms of culling like frustum or occlusion culling.

Each resulting meshlet refers to a portion of meshlet_vertices and meshlet_triangles arrays; this data can be uploaded to GPU and used directly after trimming:

const meshopt_Meshlet& last = meshlets[meshlet_count - 1];

meshlet_vertices.resize(last.vertex_offset + last.vertex_count);
meshlet_triangles.resize(last.triangle_offset + ((last.triangle_count * 3 + 3) & ~3));
meshlets.resize(meshlet_count);

However, depending on the application, other strategies for storing the data can be useful; for example, meshlet_vertices serves as indices into the original vertex buffer, but it might be worthwhile to generate a mini vertex buffer for each meshlet to remove the extra indirection when accessing vertex data, or it might be desirable to compress vertex data, as vertices in each meshlet are likely to be very spatially coherent.

For optimal performance, it is recommended to further optimize each meshlet in isolation for better triangle and vertex locality by calling meshopt_optimizeMeshlet on vertex and index data like so:

meshopt_optimizeMeshlet(&meshlet_vertices[m.vertex_offset], &meshlet_triangles[m.triangle_offset], m.triangle_count, m.vertex_count);

After generating the meshlet data, it's also possible to generate extra data for each meshlet that can be saved and used at runtime to perform cluster culling, where a meshlet is discarded if it's guaranteed to be invisible. To generate the data, meshopt_computeMeshletBounds can be used:

meshopt_Bounds bounds = meshopt_computeMeshletBounds(&meshlet_vertices[m.vertex_offset], &meshlet_triangles[m.triangle_offset],
    m.triangle_count, &vertices[0].x, vertices.size(), sizeof(Vertex));

The resulting bounds values can be used to perform frustum or occlusion culling using the bounding sphere, or cone culling using the cone axis/angle (which will reject the entire meshlet if all triangles are guaranteed to be back-facing from the camera point of view):

if (dot(normalize(cone_apex - camera_position), cone_axis) >= cone_cutoff) reject();

Efficiency analyzers

While the only way to get precise performance data is to measure performance on the target GPU, it can be valuable to measure the impact of these optimizations in a GPU-independent manner. To this end, the library provides analyzers for all three major optimization routines; for each optimization there is a corresponding analyze function, such as meshopt_analyzeOverdraw, that returns a struct with statistics.

meshopt_analyzeVertexCache returns vertex cache statistics. The common metric to use is ACMR - average cache miss ratio, which is the ratio of the total number of vertex invocations to the triangle count. The worst-case ACMR is 3 (the GPU has to process 3 vertices for each triangle); on regular grids the optimal ACMR approaches 0.5. On real meshes it is usually in the [0.5..1.5] range depending on the number of vertex splits. Another useful metric is ATVR - average transformed vertex ratio - which represents the ratio of vertex shader invocations to the total vertex count, and has a best case of 1.0 regardless of mesh topology (each vertex is transformed exactly once).

meshopt_analyzeVertexFetch returns vertex fetch statistics. The main metric it uses is overfetch - the ratio of the number of bytes read from the vertex buffer to the total number of bytes in the vertex buffer. Assuming non-redundant vertex buffers, the best case is 1.0 - each byte is fetched exactly once.

meshopt_analyzeOverdraw returns overdraw statistics. The main metric it uses is overdraw - the ratio of the number of pixel shader invocations to the total number of covered pixels, as measured from several different orthographic cameras. The best case for overdraw is 1.0 - each pixel is shaded exactly once.

Note that all analyzers use approximate models for the relevant GPU units, so the numbers they return are only a rough approximation of actual performance.

Memory management

Many algorithms allocate temporary memory to store intermediate results or accelerate processing. The amount of memory allocated is a function of various input parameters such as vertex count and index count. By default memory is allocated using operator new and operator delete; if these operators are overloaded by the application, the overloads will be used instead. Alternatively it's possible to specify custom allocation/deallocation functions using meshopt_setAllocator, e.g.

meshopt_setAllocator(malloc, free);

Note that the library expects the allocation function to either throw in case of out-of-memory (in which case the exception will propagate to the caller) or abort, so technically the use of malloc above isn't safe. If you want to handle out-of-memory errors without using C++ exceptions, you can use setjmp/longjmp instead.

Vertex and index decoders (meshopt_decodeVertexBuffer, meshopt_decodeIndexBuffer, meshopt_decodeIndexSequence) do not allocate memory and work completely within the buffer space provided via arguments.

All algorithms have bounded stack usage that does not exceed 32 KB.

License

This library is available to anybody free of charge, under the terms of the MIT License (see LICENSE.md).


meshoptimizer's Issues

Struggle building with CMake

Hi,
I would like to use gltfpack to optimize animated meshes for a web game. The issue is... I'm a web dev and have never used CMake; I don't quite understand what I'm supposed to do with it to use gltfpack. Your readme states:

On Windows (and other platforms), you can use CMake:

cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_TOOLS=ON
cmake --build . --config Release --target gltfpack

I tried to install CMake but got no CLI, only a GUI, and I have no idea where I'm supposed to run these commands.
Could you lead me to some online resources explaining how to use CMake in this context, please?

Add -kn flag to readme?

This perfectly solved the issue of trying to reference buildings by name in a scene, but I was only able to find it by searching closed issues. Perhaps worth adding to the readme for future users?

Segfault gltfpack linux 64bit fedora 30

master cc463d7

./gltfpack -v -i /home/arpu/Work/projects/vrland_assetssrc/models/bentley/bentley.glb -o bent.glb
input: 174 nodes, 170 meshes, 0 skins, 0 animations
meshes: 1332023 triangles, 1086730 vertices
[1]    3449 segmentation fault (core dumped)  ./gltfpack -v -i  -o bent.glb

I think I need debug builds and to do some gdb.

Simplification runs, but the result has many defects/issues

I am using meshoptimizer's simplification methods to simplify 3D scans which have been triangulated using marching cubes. The meshes are far denser than they need to be, so I am attempting to simplify them down to 20% of their original density.

Unfortunately the result is unusable and full of holes and other defects. Is this normal?

Here is an example:

Input Mesh

image

Simplified Result

image

I am using MeshLab to visualize the result. The meshes are exported to STL format.

The mesh format I am using internally in my software is that of the VCG library, but I think I am appropriately converting the data such that meshoptimizer's simplification methods can operate correctly.

Here is how I am converting the data:

const MeshVert* __restrict root = input.vert.data();
std::vector<unsigned> remap;
std::vector<vcg::Point3f> vertices;

for (const auto& v : input.vert) 
    vertices.emplace_back(v.P());

for (const auto& f : input.face) {
    remap.emplace_back(unsigned((MeshVert*)(f.cV(0)) - (MeshVert*)root));
    remap.emplace_back(unsigned((MeshVert*)(f.cV(1)) - (MeshVert*)root));
    remap.emplace_back(unsigned((MeshVert*)(f.cV(2)) - (MeshVert*)root));
}

const size_t index_count = remap.size();
std::vector<unsigned> simplified(index_count);
size_t target_count = simplification_factor * index_count;
float target_error = 1e-3f;

if (restricted)
    simplified.resize(meshopt_simplify(&simplified[0], remap.data(), index_count, (const float* __restrict)&vertices[0], vertices.size(), sizeof(vcg::Point3f), target_count, target_error));
else
    simplified.resize(meshopt_simplifySloppy(&simplified[0], remap.data(), index_count, (const float* __restrict)&vertices[0], vertices.size(), sizeof(vcg::Point3f), target_count));

I'm a little confused by the concept of "Index" in this mesh processing library. Are the indexes here referring to triplets of offsets from Vertex_Ptr* + 0 which represent each face/triangle?

The VCG mesh has a vector of vertices, each represented by a Point3f type (the internal data of which is 3 floats - X, Y, and Z). The faces contain 3 vertex pointers. I compute the offset/index from these pointers by subtracting the start of the vertex array (root).

Is this the correct approach? Am I doing something wrong? The result looks reasonable, but not entirely correct. Perhaps this is just the nature of decimation?

Object space mesh error from meshopt_simplify()?

I currently compute an approximate error metric roughly based on the Hausdorff distance between the two meshes after generating the simplified one. Is there any chance to avoid this calculation and instead use a consistent object-space error metric as used internally by meshopt_simplify()?

Thanks in advance.
BTW very handy library. much appreciated.

gltfpack: Validation errors (MESH_PRIMITIVE_ATTRIBUTES_ACCESSOR_INVALID_FORMAT)

Hi

The optimized files created via gltfpack.exe seem to contain some incorrect attributes and cannot be opened by some glTF importers (e.g. the built-in 3D Viewer on Windows 10 doesn't open the files).

The glTF viewer on https://gltf-viewer.donmccurdy.com/ is able to open the files but reports the following errors:

MESH_PRIMITIVE_ATTRIBUTES_ACCESSOR_INVALID_FORMAT | Invalid accessor format '{VEC3, BYTE normalized}' for this attribute semantic. Must be one of ('{VEC3, FLOAT}'). | /meshes/0/primitives/0/attributes/NORMAL
-- | -- | --
MESH_PRIMITIVE_ATTRIBUTES_ACCESSOR_INVALID_FORMAT | Invalid accessor format '{VEC3, UNSIGNED_SHORT}' for this attribute semantic. Must be one of ('{VEC3, FLOAT}'). | /meshes/0/primitives/0/attributes/POSITION
MESH_PRIMITIVE_ATTRIBUTES_ACCESSOR_INVALID_FORMAT | Invalid accessor format '{VEC2, UNSIGNED_SHORT}' for this attribute semantic. Must be one of ('{VEC2, FLOAT}', '{VEC2, UNSIGNED_BYTE normalized}', '{VEC2, UNSIGNED_SHORT normalized}'). | /meshes/0/primitives/0/attributes/TEXCOORD_0
ACCESSOR_NON_UNIT | 2368 accessor elements not of unit length: 0. [AGGREGATED] | /accessors/0

Using templated parameters for indices and vertex positions

If templated parameters were used for indices (common types include uint8_t, uint16_t, uint32_t, unsigned short, and unsigned int) and vertex positions (float, double), the library would be more flexible with regard to the potential needs of users.

gltfpack: Error running through large gltf file

Running gltfpack from npm.
gltfpack -i sourcefile.gltf -o output\destination.gltf

Error:

exception thrown: RuntimeError: memory access out of bounds,RuntimeError: memory access out of bounds
    at wasm-function[30]:0xaed
    at wasm-function[52]:0x1acd
    at wasm-function[845]:0x238d1
    at wasm-function[847]:0x23d08
    at wasm-function[850]:0x24121
    at wasm-function[410]:0xd47a
    at wasm-function[786]:0x20e32
    at Module._main (\Roaming\npm\node_modules\gltfpack\bin\gltfpack.js:2:84300)
    at callMain (\Roaming\npm\node_modules\gltfpack\bin\gltfpack.js:2:85315)
    at doRun (\Roaming\npm\node_modules\gltfpack\bin\gltfpack.js:2:85893)

Input model is a large gltf file (440 MB gltf + 502MB bin).

optimizeVertexCache results

Thanks for this library ( offtopic: also thanks for the roblox graphics api stats)

I am replacing Forsyth's optimizeFaces with optimizeVertexCache.

I have tested on models from https://casual-effects.com/data/.
On most of the models the acmr is the same or better with optimizeVertexCache.

But for some models like Sports Car or Road Bike the acmr is slightly larger with optimizeVertexCache. Difference is 0.01-0.04

I just wanted to ask if this is a known difference? I have used meshopt_analyzeVertexCache for the test.

Any plans to integrate with Blender and/or FreeCAD?

Looks like there is potential for integration with some widespread software offering mesh optimization functionality, but without all the cool tricks.

Have you considered contacting them in this matter?

[gltfpacker] Feature request: add support for unlit materials (KHR_materials_unlit extension)

Hi,

In order to test the models from the glTF-Sample-Models more easily I've made a small set of scripts to download the latest models and run them through gltfpacker: https://github.com/TimvanScherpenzeel/gltfpacker-sample-model-test. It is very much hacked together but it does the job, perhaps parts of it can also be useful to you too.

I haven't spotted any bugs or unhandled cases except for a model with unlit materials (using the KHR_materials_unlit extension). I think the objective of the extension is in line with the goals of gltfpacker.

Original
Screen Shot 2019-07-03 at 21 53 07

gltfpacker
Screen Shot 2019-07-03 at 21 46 36

UnlitTest.zip

Kind regards,

Tim

simplifySloppy aggressively welds vertices causing issues for attribute discontinuities

When I use the aggressive option in gltfpack I find that the model shading looks off. This appears to be caused by the normals being a bit messy. Attached is a screenshot from the babylon.js viewer with normal preview turned on for one of the meshes (hence the beautiful rainbow).

Screen Shot 2019-10-24 at 3 00 30 PM

I'm not sure if this is an unavoidable artifact, or something that can be improved. Also, I realize the input model is already pretty sparse. Nonetheless, the vertex removals seem reasonable, if only the normals were better preserved.

Here's an example of the command I'm using. I've also attached the input and output GLBs I'm using.

gltfpack -i /tmp/input.glb -o /tmp/output.glb -si 0.8 -sa -v

Archive.zip

gltfpack: option to disable mesh consolidation?

I want to get the "goodness" of gltfpack while still allowing a web application to access some individual meshes that I've grouped for individual access in a web app.

Perhaps I'm doing something that should be done through separate gltf files, in other words maybe it doesn't make sense to have this option since it's a core feature of gltfpack?

I can provide code example if helpful but at least wanted to start the thread to see if this makes sense as an issue.

I'm using this component for A-Frame to access individual parts of the gltf:
https://github.com/supermedium/superframe/tree/master/components/gltf-part/

Versions:
A-Frame Version: 1.0.4 (Date 2020-05-07, Commit #9022b97e)
THREE Version (https://github.com/supermedium/three.js): ^0.115.1
installed gltfpack with npm install -g gltfpack, I think it's using commit hash 9e89bf3

Questions: Material Index and UVs

Hi,

In my model, there can be two adjacent-co-planar triangles with different material IDs. They should not be merged. How to handle this case?

Moreover, I have UVs defined at Triangle Vertex. I mean, for a given Cube model, I have,
Vertices: 24
Vertex Normals: 24
Indices: 36
UVs: 36 (Every triangle will have 3 UVs; (one for each vertex in triangle)

Now I could not find a way to set UVs in the simplified mesh from the original mesh.
Any clue?

Thanks in advance.

Some metrics

Hi Arseny,

first of all, awesome project, really cool.

Second, sorry if I opened an issue, but this is something more like a request for any info regarding some performance metrics.

I'm developing a jvm port of Assimp here and I should implement some state-of-the-art post-process techniques to improve the time required to render them.

Before ending up on your repo, I was actually looking for the best method to improve vertex caching.
And I guess it's this one: "An Improved Vertex Caching Scheme for 3D Mesh Rendering", because it's even better than AMD Tootle by about 10% (although it requires ~100x the time).

Anyway, as I already said, I'd be interested to know if you have ever taken any metrics for your mesh optimizer, or if you have any considerations/feedback on implementing just a lone vertex cache optimization versus a full-pipeline optimization like your project.

Thanks in advance

Is uneven decimation normal on some models ?

Hi, I integrated your optimizer for generating collision meshes by using meshopt_simplify followed by meshopt_optimizeVertexFetch. Unfortunately, on some meshes, parts of the decimated result are aggressively simplified while other parts keep redundant triangles; the vertices have no attributes other than their position.
Does it come from the topology of the mesh? My settings?
The most noticeable example comes from an airport terminal, as you can see here; some parts are heavily reduced while others seem to be ignored.
The target errors were 0.001 and then 0.1, and the vertices preserved were respectively 64.5% and 63%.

Models as used (only positions for vertex attribute), exported as .OBJ:
Original Decimated

Consider adding a --version flag

Before commenting on #88 I'd wanted to make sure I had the latest version of gltfpack, and wasn't able to tell from inspecting the executable or its CLI output. A gltfpack --version flag would be helpful for that purpose. Thanks!

Different vertex count between tiny obj loader and fast obj loader

Good afternoon, hope you are having a nice xmas break.

I have been using fast obj for my Vulkan code and it worked quite well. Now I need to merge back into my main engine, where I have a mesh compiler which uses tiny obj loader; when I ported the code to fast obj I started noticing this issue: when loading the file, I am getting an extra vertex from fast obj.
Tiny obj: 3792
Fast obj: 3793

Maya reports 3792:
image

This is an issue on my end because I am exporting extra data from Maya, like tangents and skinning data, which I fetch using the same obj indexing. With the extra array entry I get an out-of-bounds assertion.

Here below (and attached) a sample code to reproduce:

#define TINYOBJLOADER_IMPLEMENTATION
#include "tiny_obj_loader.h"

#ifndef _CRT_SECURE_NO_WARNINGS
#define _CRT_SECURE_NO_WARNINGS
#endif
#define FAST_OBJ_IMPLEMENTATION
#include "fast_obj.h"

#include <iostream>

const char *PATH = "test.obj";

int main() {
  // loading the obj using tiny
  tinyobj::attrib_t attr;
  std::vector<tinyobj::shape_t> shapes;
  std::vector<tinyobj::material_t> materials;

  std::string warn;
  std::string err;
  bool ret = tinyobj::LoadObj(&attr, &shapes, &materials, &warn, &err, PATH);
  if (!ret) {
    printf("Error loading %s: file not found\n", PATH);
    return 0;
  }

  //loading using fast obj
  fastObjMesh *obj = fast_obj_read(PATH);
  if (!obj) {
    printf("Error loading %s: file not found\n", PATH);
    return false;
  }

  //now compare
  printf ("Tiny vertex count %i \n" ,static_cast<int>(attr.vertices.size()/3));
  printf ("Fast vertex count %i" ,obj->position_count);
}

Here the output:
image

Am I doing anything wrong? I have thought about what the issue could be and have not found a solution yet. I have also tried pre-triangulating the mesh before export, and got the same result.
If anything else is needed on my side please let me know.

The main reasons I am switching to fast obj is:

  1. Hopefully faster debug and release performance than tiny.
  2. Since I am about to use mesh optimizer anyway I can just use fast obj and have one dependency, not two.

Best regards

M.

solution/code/model link: https://1drv.ms/u/s!Ai0n7iKmKMz0gthD0wT4C3KUKbW5Pw?e=eKo2ZN

missing texture reference when using gltf pack

Hi, for models that have textures, after I run gltfpack and attempt to load the model, I'm getting missing texture reference errors,
e.g.: gltfpack -i Sponza.gltf -o scenepacked.glb

from this dataset:
https://github.com/KhronosGroup/glTF-Sample-Models/tree/master/2.0/Sponza/glTF

If I then try to view my packed .glb in something like the glTF viewer
(https://gltf-viewer.donmccurdy.com/)

I get this:

IO_ERROR Node Exception: TypeError: Failed to fetch /images/0/uri
IO_ERROR Node Exception: TypeError: Failed to fetch /images/1/uri
IO_ERROR Node Exception: TypeError: Failed to fetch /images/2/uri
...etc

I've also tried using the Babylon.js loader, and I'm getting missing texture errors.
If I try to pack a gltf that does not contain textures to begin with, everything works.

Please advise.

[gltfpack] Issue with the displacement of position in animated skinned meshes

Hi Arseny,

Thank you for all your hard work on gltfpack!

I've been testing various glTF models from the glTF-Sample-Models repo and came across a few that have issues after packing with gltfpack with default settings.

  • BrainStem displaces the position of the mesh from the original position. Animations themselves appear to work correctly.

Correct
Screen Shot 2019-06-27 at 19 59 14

Incorrect
Screen Shot 2019-06-27 at 19 59 29

  • CesiumMan has the same issue, displaces the position of the mesh from the original position. Animations themselves appear to work correctly.

Correct
Screen Shot 2019-06-27 at 20 03 45

Incorrect
Screen Shot 2019-06-27 at 20 03 16

  • AnimatedCube does not spin anymore after conversion. There is an animation track but it does not appear to play correctly.

Incorrect
Screen Shot 2019-06-27 at 19 59 49

  • BoxAnimated does not animate anymore after conversion. There is an animation track but it does not appear to play correctly.

  • TriangleWithoutIndices does not render correctly, the viewer throws the following error: Cannot read property 'updateMatrixWorld' of undefined.

I've created a fork of three-gltf-viewer that uses the latest versions of lib/GLTFLoader.js and js/meshopt_decoder.js from this repo (using raw.githack.com) directly to test the implementation more easily, perhaps it can also be of use to you during development.

Kind regards,

Tim

simplifySloppy overestimating actual triangle count

Hi, I'm working on a project which includes automatically simplifying arbitrary meshes to various triangle counts. I've been using simplifySloppy because it seems more reliable for various poor quality input meshes, and being able to achieve very low triangle counts for low LODs.

I found that when reducing a highly detailed mesh to very low triangle counts, the grid size being used sometimes ended up being bigger than it needed to be - this is because countTriangles() wasn't taking into account that some triangles become duplicates and then get eliminated during filterTriangles(), after quantization.

I've made a fix for it here: virtalis@62ab34f. Would you like me to open a pull request, or was it left this way intentionally for speed? It now requires creating a hash table of triangles during each loop iteration at the start of the algorithm, which will obviously slow it down a bit compared to the simple sum it was doing before.

--

I've also made some changes to prevent vertices being collapsed when their normals are drastically different, by expanding the grid into an additional 3 dimensions for normal space: https://github.com/virtalis/meshoptimizer/commits/simplifySloppy-normals. I'm not sure if these are changes you'd be interested in, but I can open another issue to discuss them if you are.

Attribute-aware error metrics for simplification

Hi! I'm playing with https://developer.nvidia.com/orca/amazon-lumberyard-bistro dataset and meshoptimizer and I've noticed this particular failcase related to the way the corners were authored.

For example, here is the original chair mesh:
image

Notice the rounded corners with shared vertices. They survive the first pass of meshopt_simplify to half the number of triangles fine:
image

However, once the triangles between two sides facing at 90deg get folded and the sides start sharing the vertices, the vertex normals can no longer be correct:
image

What would be a solution to (automatically) preventing this?

I was thinking of adding additional custom skip code in the 'pickEdgeCollapses' loop if angle between vertex normals is above certain threshold but I'm sure there's a better/simpler solution, perhaps already there? :)

(instead of preventing collapse, could also allow it but duplicate verts so normals aren't shared?)

Thanks for the great library!!

gltfpack: Significant UV distortion for models with high tiling factors

I think I'm seeing a similar issue in v0.13 of gltfpack. Testing against the model here: https://sketchfab.com/3d-models/melodia-city-hotel-a2fb8e4065ce470296d6d801daa37f18, with only default gltfpack options used:

before after
before after

annotated diff:

diff

^Much of that diff is in reflections, which probably relates to merging meshes with transparency, and that seems fine. But the highlighted areas on the side of the building and the palm trees in front of the building show noticeable UV shifts that might suggest an issue:

Screen Shot 2019-12-24 at 2 02 23 PM

how to set BASISU_PATH?

Error: basisu is not present in PATH or BASISU_PATH is not set

this is the folder structure
Screenshot 2020-04-22 at 7 47 16 PM

this is the env file
Screenshot 2020-04-22 at 7 47 58 PM

What am I missing? It keeps telling me to set PATH.

gltfpack: preserve material extras

Currently, gltfpack discards any "extras" attached to materials when it merges materials. It would be nice if we could optionally preserve the extras when merging and only pass the areMaterialsEqual test if they are equal.

simplifySloppy doesn't have a way to specify the target error

Hi,

I'm playing a bit with your great library in order to decimate some over-detailed 3D models, before using the result as the basis for a collision mesh.

The input I have is therefore basically a list of vertex positions and a list of indices (no other vertex attributes).

I'm first running meshopt_generateShadowIndexBuffer() in order to have all vertices with a similar position to be collapsed.

Basically, my issue is that I haven't managed to get a result that's simplified enough using meshopt_simplify(). The initial mesh has 345k triangles, the output has 285k triangles, whatever parameters I've tried.

I've set the target indices count to something very low (50) and even tried to increase the error to its maximal value (1.0f), however, that doesn't make a big difference.

Using meshopt_simplifySloppy(), results are much better (basically reaching the target index count); however, I want to use an error threshold rather than a target index count (since I don't want the artists/users to have to enter a target index count; moreover, the input meshes are really variable, some too detailed and some not, so an error threshold makes much more sense to me).

So I'm stuck with meshopt_simplify() I think, unless you have an idea how to use meshopt_simplifySloppy() with an error threshold as input?

meshopt_simplify:
simplify

meshopt_simplifySloppy:
simplifysloppy

What puzzles me a bit is that meshopt_simplify() doesn't modify the topology, despite having the error threshold set to 1.0f and the target index count set to 10. You can clearly see that when looking at the windows of the building in the foreground. Despite the significant threshold, the windows are not simplified.

Is that normal? Can I improve it? Or is there a way to use meshopt_simplifySloppy() with an error threshold as input, rather than a target index count?

Thanks a lot for your help!

gltfpack does not preserve node hierarchy

Is there some option to keep the "body" nodes in the image below:
image

because running gltfpack without parameters results in a flat node list:

image

this makes it impossible to move the bodies

Consider MESHOPT_-prefixed extensions

As the KHR_quantized_geometry and KHR_meshopt_compression extensions are pseudo-extensions at this point, consider reserving the MESHOPT_ prefix (or similar) in the glTF repository and using that for now. Details are on GitHub, but as a short summary:

  • Vendor extensions like this are recommended for single-party needs and initial proposals.
  • EXT_ extensions are recommended for multi-party extensions that are not, for whatever reason, ratified by Khronos or protected by the Khronos IP framework.
  • KHR_ extensions are ratified by Khronos, and protected by its IP framework.

I like the idea of exploring official versions of both, but that would take some time, and best to prevent any confusion in meanwhile. Thanks! 🙂

Preserve extras for nodes

Hi, this is a great tool, but I face some problems:

Input gltf file has some extras in nodes:

"nodes": [
    {
      "name": "parent",
      "extras": {
        "xxx": 555
      },
      "children": [1]
    },
    {
      "name": "mesh",
      "mesh": 0,
      "extras": {
        "yyy": 666
      }
    }
  ]

It would be nice to preserve extras not only for materials, but for nodes too:

gltfpack-0.14-windows> .\gltfpack.exe -i test.gltf -o test_out.gltf -v -kn -ke -noq

"nodes": [
    { "mesh": 0 },
    { "name": "parent", "children": [2] },
    { "name": "mesh", "children": [0] }
  ]

Add simplify option to gltfpack

Thank you for this super useful library! I would like to further optimize models by running simplification. It would be really helpful if gltfpack allowed me to do this one shot.

For example, -si 0.4 could reduce the poly count to 40% of the original.

./gltfpack -i glTF-Sample-Models/2.0/BarramundiFish/glTF/BarramundiFish.gltf -o ./glTF/BarramundiFish-simple.gltf -si 0.4

Using overdraw optimization by itself

It would be nice to be able to use the overdraw optimizer by itself, both for testing purposes and because I am using unusual geometry with many concavities, which means that the pixel processing cost far outweighs the cost of vertex transformation.

Unprefixed CMake options polluting global namespace

MeshOptimizer uses generic unprefixed names for its CMake options. For example:

option(BUILD_DEMO "Build demo" OFF)
option(BUILD_TOOLS "Build tools" OFF)
option(BUILD_SHARED_LIBS "Build shared libraries" OFF)

There's no problem if MeshOptimizer is compiled as a standalone project.

However, when MeshOptimizer is compiled as part of a complex CMake build process, these option names pollute the global namespace.

Popular libraries use prefixes for their CMake options. For example, Assimp uses the ASSIMP_ prefix (ASSIMP_BUILD_ASSIMP_TOOLS, ASSIMP_BUILD_ASSIMP_TESTS, etc), and GLFW uses the GLFW_ prefix (GLFW_BUILD_EXAMPLES, GLFW_BUILD_DOCS, etc).

A similar approach should be adopted for MeshOptimizer.

gltfpack: model origin shifted after optimization

Hey, first of all, thank you so much for your work on this tool. The filesize results are really looking great right now.

My use case is as follows: I'm trying to compress models I'll be using in a real-time three.js scene. I use Blender to correct my models and set the glTF file up. I use the gltfpack CLI to optimize the file size.

When I use the .gltf file straight out of Blender in my three.js scene, the origin I set in Blender is respected. However, after running it through gltfpack, the origin is shifted and looks like it's somewhere in the upper-left corner relative to the model. This happens with no options, with -c, and with -cc; basically as soon as the file goes through gltfpack.

I'm sorry I don't know more, but it might be a bug with how origins are handled? This is also happening with every model I try this on. If this is user error, I'd be more than happy to learn from my mistakes!

You can find the original GLTF file with correct origin + the one that was converted by gltfpack in that zip: https://we.tl/t-4H8zlF8exe

Thank you!

Encode/decode deinterleaved vertex data

Hi @zeux,

Maybe I'm getting it wrong, but if I want to encode/decode deinterleaved vertex arrays, it seems that I'm always forced to use a multiple of 4 bytes per vertex attribute:

https://github.com/zeux/meshoptimizer/blob/master/src/vertexcodec.cpp#L1070

For instance, it means that I would need to use floats instead of shorts for quantized positions. That's probably not an issue when encoding, as general-purpose compressors will deflate the unused space in each float.

The problem for me comes when decoding: I usually upload quantized attributes to the graphics card and dequantize them in the shader (normals, positions, etc.). If I use the meshoptimizer decoder, will I be spending extra memory on the GPU if I upload the buffer directly from decode? I guess yes. In that case I would need to decide whether it's better to spend the extra time copying arrays or the extra memory on the GPU.

Is what I said correct? Any advice on the matter?

Thanks!
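One way to handle the padding on the decode side is to decode into a buffer with the codec's padded stride (a multiple of 4 bytes) and then compact it to the tight stride before uploading to the GPU. Below is a minimal sketch of the compaction step only; the decode call itself is omitted, and `compact_stream` is a hypothetical helper name, not part of the meshoptimizer API:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy each vertex's first `tight_stride` bytes out of a buffer that was
// decoded with a padded stride, producing a tightly packed array suitable
// for GPU upload.
std::vector<uint8_t> compact_stream(const uint8_t* padded, size_t vertex_count,
                                    size_t padded_stride, size_t tight_stride)
{
    std::vector<uint8_t> tight(vertex_count * tight_stride);
    for (size_t i = 0; i < vertex_count; ++i)
        std::memcpy(tight.data() + i * tight_stride,
                    padded + i * padded_stride, tight_stride);
    return tight;
}
```

For example, positions quantized to three uint16 values (6 bytes) would be decoded with an 8-byte stride and then compacted to 6 bytes per vertex; whether that extra CPU copy beats the extra GPU memory depends on the platform and how often the data is decoded.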
