fishkerez / mimetype

This project forked from gabriel-vasile/mimetype

A fast Golang library for media type and file extension detection, based on magic numbers

Home Page: https://pkg.go.dev/github.com/gabriel-vasile/mimetype#pkg-overview

License: MIT License

mimetype

A package for detecting MIME types and extensions based on magic numbers

Goroutine safe, extensible, no C bindings

Features

Install

go get github.com/gabriel-vasile/mimetype

Usage

mtype := mimetype.Detect([]byte)
// OR
mtype, err := mimetype.DetectReader(io.Reader)
// OR
mtype, err := mimetype.DetectFile("/path/to/file")
fmt.Println(mtype.String(), mtype.Extension())

See the runnable Go Playground examples.

Caution

Only use libraries like mimetype as a last resort. Content type detection using magic numbers is slow, inaccurate, and non-standard. Most of the time, protocols have their own methods for specifying such metadata, e.g., the Content-Type header in HTTP and SMTP.
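
As a rough illustration of the "last resort" advice, the sketch below trusts the media type declared by the protocol and sniffs the content only when no usable declaration exists. The function name mediaType and the fallback policy are assumptions made for this example. The standard library's http.DetectContentType stands in for the sniffer so that the snippet has no external dependencies; mimetype.Detect could take its place.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// mediaType returns the media type declared by the protocol when there is
// one, and falls back to sniffing the content otherwise.
// The name and the fallback policy are illustrative assumptions.
func mediaType(declared string, body []byte) string {
	if declared != "" {
		// ParseMediaType strips parameters such as "; charset=utf-8".
		if mt, _, err := mime.ParseMediaType(declared); err == nil {
			return mt
		}
	}
	// Last resort: guess from the leading bytes.
	return http.DetectContentType(body)
}

func main() {
	png := []byte("\x89PNG\r\n\x1a\n")
	fmt.Println(mediaType("application/json; charset=utf-8", nil)) // declared type wins
	fmt.Println(mediaType("", png))                                // nothing declared: sniff
}
```

In an HTTP handler, the declared value would typically come from r.Header.Get("Content-Type").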

FAQ

Q: My file is in the list of supported MIME types but it is not correctly detected. What should I do?

A: Some file formats (often Microsoft Office documents) keep their signatures towards the end of the file. Try increasing the number of bytes used for detection with:

mimetype.SetLimit(1024*1024) // Set limit to 1MB.
// or
mimetype.SetLimit(0) // No limit, whole file content used.
mimetype.DetectFile("file.doc")

If increasing the limit does not help, please open an issue.

Structure

mimetype uses a hierarchical structure to hold the MIME type detection logic. This reduces the number of calls needed to detect the file type. The reason behind this choice is that some file formats are used as containers for other file formats. For example, Microsoft Office files are just zip archives containing specific metadata files. Once a file has been identified as a zip, there is no need to check whether it is a text file, but it is worth checking whether it is a Microsoft Office file.
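
The idea can be sketched with a small tree in which each node's children are tried only after the node itself has matched. Everything below (the node type, the helper functions, and the simplified signature checks) is an assumption made for illustration and is not taken from mimetype's source; in particular, real Office detection is more involved than searching for a "word/" entry name.

```go
package main

import (
	"bytes"
	"fmt"
)

// node is a hypothetical tree node: a MIME type, a matcher, and the more
// specific formats that are tried only once this node has matched.
type node struct {
	mime     string
	match    func([]byte) bool
	children []*node
}

// detect returns the most specific matching MIME type at or below n.
func (n *node) detect(data []byte) string {
	for _, c := range n.children {
		if c.match(data) {
			return c.detect(data)
		}
	}
	return n.mime
}

func hasPrefix(sig string) func([]byte) bool {
	return func(b []byte) bool { return bytes.HasPrefix(b, []byte(sig)) }
}

func contains(s string) func([]byte) bool {
	return func(b []byte) bool { return bytes.Contains(b, []byte(s)) }
}

const docx = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

// root is the fallback type; its matcher is never called.
var root = &node{
	mime: "application/octet-stream",
	children: []*node{
		{mime: "application/zip", match: hasPrefix("PK\x03\x04"), children: []*node{
			// Simplified check: a .docx archive has entries under "word/".
			{mime: docx, match: contains("word/")},
		}},
		{mime: "image/png", match: hasPrefix("\x89PNG\r\n\x1a\n")},
	},
}

func main() {
	fmt.Println(root.detect([]byte("PK\x03\x04....word/document.xml"))) // zip, then docx
	fmt.Println(root.detect([]byte("PK\x03\x04....readme.txt")))        // plain zip
	fmt.Println(root.detect([]byte("hello")))                           // no match: fallback
}
```

Input that fails the zip check never reaches the docx matcher, which is where the saving in calls comes from.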

To avoid loading entire files into memory, when detecting from a reader or from a file, mimetype limits itself to reading only the header of the input.

[Figure: mimetype's hierarchical detection structure]

Performance

Thanks to the hierarchical structure, searching for common formats first, and limiting itself to file headers, mimetype matches the performance of the standard library's http.DetectContentType while outperforming the alternative package, filetype.

                            mimetype  http.DetectContentType      filetype
BenchmarkMatchTar-24       250 ns/op         400 ns/op           3778 ns/op
BenchmarkMatchZip-24       524 ns/op         351 ns/op           4884 ns/op
BenchmarkMatchJpeg-24      103 ns/op         228 ns/op            839 ns/op
BenchmarkMatchGif-24       139 ns/op         202 ns/op            751 ns/op
BenchmarkMatchPng-24       165 ns/op         221 ns/op           1176 ns/op

Contributing

See CONTRIBUTING.md.
