
http-server's Introduction

HTTP-Server

An HTTP server written entirely in ANSI C. It currently only works on Linux.

Code-Checks

CodeFactor Benchmark Results

Features

  • Serves static files
  • Can be embedded into any other project to handle (GET) requests
  • Can expose port 9001 to allow monitoring with Prometheus

Command line Usage

Simply compile the code and execute the resulting binary to start a simple web server.

Command line Arguments

  • "-p [port]" sets the port to listen on (default: 80)
  • "-tc [count]" sets the number of threads used to handle requests (default: 100)
  • "-c" enables the Cache-Control header with a default cache time of one day
  • "-d" enables debug mode
  • "-m" enables measuring the performance of the most important functions (should only be used for optimization)
  • "-t" enables the templating system
  • "-ic" enables the internal caching system, which loads all files in the website/ directory on startup. A restart is needed to reload them

Templating System

Features

  • include an html file in another html file

Usage

To include a file, simply add <--include path="path to file"/> to your HTML file. It will load the specified file and replace the tag with the content of that file.

Note: The path to the file has to start with a '/', otherwise the file cannot be found.

Error Handling

When a specified file cannot be found, the tag is simply removed.

Implementing into existing project

Adding it to the project

'All' you need to do is add the 'src/webserver' folder to your project and include the 'server.h' header; this should cover everything you need.

An example for this can be seen in the 'examples' directory, where you can also find different use cases.

Starting the Webserver

To create a server, you simply need to run: int serverFd = createServer(port);

To then start the server, run: startServer(serverFd);

Note: The startServer() function blocks the calling thread, so if you want the server to run in the background you need to execute it in another thread.

Options

  • You can enable debug mode by running setDebugMode(1);
  • You can enable the use of cache headers by running setGeneralCaching(1);
  • You can enable measuring of execution speed by running setMeasureExec(1); (mostly used when optimizing the webserver itself)
  • You can enable the templating system by running setGeneralTemplateUsage(1); (this only affects the serving of static files)

Adding custom Paths

Note: This feature is still in development, so it may have some problems or unfinished parts.

To add a custom path, simply run addCustomPath(Method, Path, function); This routes every request whose path matches the specified one to the given function.

The function should look something like this: int name(request* req, response* resp); It should return 0 if everything worked, or a value smaller than 0 if something went wrong.

Important Functions

setStatus(response, statusCode, statusMessage);

  • response: A pointer to the response struct
  • statusCode: An integer that represents the status code for the response (e.g. 200, 404)
  • statusMessage: The status message corresponding to the status code

setData(response, responseData, dataLength);

  • response: A pointer to the response struct
  • responseData: A char* that contains the response body; the value it points to needs to persist outside of your function and will be freed for you afterwards
  • dataLength: The length of the responseData, as an integer

setContentType(response, contentType, contentLength);

  • response: A pointer to the response struct
  • contentType: A char* that contains the content type; this is only used for the 'Content-Type' header
  • contentLength: The length of the content itself, as an integer

http-server's People

Contributors

lol3rrr

Stargazers

Christoph Flügel

Watchers

James Cloos

http-server's Issues

Use Request/Response Pools

Problem

Right now all the request and response instances are created on demand, before a new connection is accepted. This just causes a bunch of mallocs and frees that aren't really needed, as the instance lifetime is pretty clear.

Solution

Introduce a request/response pool that contains a number of request and response instances that are ready to go. Once a request is done, the instances can simply be "cleaned up"/reset to their original state and added back to the queue.
This could be implemented using a queue where objects are taken from the top and added to the bottom. If an instance is requested but the queue is empty, a new instance is created; it is afterwards added to the queue like normal, so the pool automatically adjusts its size to the demand on the website.
These operations need to be thread-safe, so the queue should also hold a mutex/lock to synchronize access across threads.

Add support for patterns in paths

Right now, when using custom paths, the given function is only called when the paths match exactly.

This should be made a bit more generic, so that, for example, the function for the path "/users" is also called if the requested path is "/users/".
You should also be able to add paths like "/users/*" to handle every path that matches this pattern.

Note: It should still check for the exact matches before searching for patterns to improve performance.

Using some fixed size arrays

Problem

Right now all strings are dynamically allocated. This is the best option for memory consumption, as it only uses what it actually needs. From a performance point of view, however, it is not the best: you usually try to reduce the amount of memory you allocate at runtime, and especially for short strings it's honestly kind of a waste.

Solution

For certain things, there should be fixed size arrays.
One good example is the request method, as it only has a couple of possible values, and even the worst case only wastes a couple of bytes, so it's no big deal.
This should help keep the number of allocations lower and removes one piece of dynamic memory that needs to be managed.

Switch to using treaps or similar for Header

Problem

Right now a linked list is used to store the headers, which works but is not ideal, as searching for or inserting an element takes O(n) time.

Solution

Instead of a linked list, a treap or a similar data structure should be used, as lookups should be faster: O(log n) on average and O(n) at worst. So overall it should be faster, or at worst the same, and even then it should help make the code more understandable.

Add the parsing of the Query params

Right now the query params are just part of the path.

The params should be parsed the same way headers are parsed and removed from the path.

Add tests

Problem

Right now none of the functionality is actually tested, and basically all bugs are found by luck or in production, which is not good.

Solution

A testing framework/library like AceUnit or GNU Autounit should be used to test the important parsing functions first, as these have the highest priority; over time the tests can then be expanded to hopefully cover everything, or nearly everything.

Add the parsing of request bodies

Right now the request body is simply ignored.

The request body should also be parsed as part of an incoming request and should have its own field in the request struct.

Use thread pools

Problem

Right now the program creates a new thread for every single request, which is not really efficient.

Solution

Add a thread pool which then handles all the requests. This thread pool can initially have a fixed size, but should later be dynamically sized to adapt to the load it is experiencing, with a maximum and a minimum number of threads in the pool.

Preventing use of '..' in paths

Problem

Right now the server treats the given path as valid, prepends the "website" folder, and returns the file found at the resulting path.
This can cause security problems if a '..' is used to navigate up a directory.

Solution

The server should check for '..' in the given path and, if found, return a '403 Forbidden' response to the client.

Alternate internal cache

Problem

Right now all the files are stored in memory, which works fine for most situations but can get out of hand for larger ones.

Solution

Instead of saving all the files in their "ready" state in memory, the hashmap can simply contain an open file descriptor for each file, which can then be used to read the data in 1024-byte chunks.
For the reading part, pread should be used, as the offset to read from can be passed as an argument without seeking to that position.

To allow for templates, these files can be processed in the populateCache function, similar to how they are handled right now. They are then written to a temporary file, and a file descriptor for that file is opened and stored in the hashmap.
This makes sure that all the files are included correctly, but still allows us to use a single simple file to read from.

This solution should cut down the memory usage again, by a lot.

Add Dockerfile

Add a Dockerfile to automatically build a basic docker image of the webserver.

The docker image should serve as a basic starting point, which can either be used to build a new image for each project or to simply add the website files through a volume.

Switching socket model

Is your feature request related to a problem? Please describe.
Right now all requests are accepted on a single thread, which adds them to a queue; a thread from the thread pool then takes a request and actually handles it.
This means there is always the memory allocation for the queue, plus the locks/mutexes needed for synchronizing access to said queue.

Describe the solution you'd like
Use the SO_REUSEPORT flag on unix to have each thread in the thread pool listen on its own socket. The connections are automatically split up between the different sockets, so that is not something we need to worry about.
Each thread can then accept a connection and handle the request directly by itself, which should be at least a bit faster.

Additional context
This not only allows the application to accept multiple connections at the same moment, because they are split up, but also removes the memory allocations and the use of locks/mutexes in the queue, which could otherwise become a real performance bottleneck that could not be removed or optimized in any meaningful way.

Resources
man
stack exchange thread

Add stats

Add Prometheus stats to it, enabled with a compiler flag.

Windows compatible

Is your feature request related to a problem? Please describe.
The code is currently only designed to work on unix and will need some functions as an abstraction over the underlying OS-dependent functions.
This has only now become possible with the thread pool, because now all the unix-specific functions should have similar counterparts on Windows.

Describe the solution you'd like
All the OS-specific functions, like sockets, threads and so on, should have a small abstraction on top of them, which is then actually used in the code itself.
The abstractions will mostly be modeled on the unix functions and have a similar API; at compile time they are implemented for unix or Windows using compile-time conditionals. All of this should have its own header, to better separate the OS-specific code from the rest and hide what is going on behind the scenes.

Additional context
This is not a high priority issue, but would still be a nice addition to this project.
These abstractions could then also be used to implement mock functions for otherwise hard-to-test functions and therefore enable the use of more tests.

Weird crash with invalid free

Sometimes when handling a request, an invalid free() error occurs and the program crashes.
I'm unsure what causes this issue, as it does not strictly seem to correlate with any specific file size or type, although bigger files tend to have it more often.
This for now needs a closer look to actually find the root cause of it.

Switch to using threads

Problem

Right now each connection is handled by a new process, which works but is not the fastest/best option.

Solution

Switch over to using threads to handle each connection. This also allows us to use only malloc instead of mmap, as memory is already shared between threads.

Resources

GeeksforGeeks
StackOverflow 1
StackOverflow 2

Less Mallocs

Problem

Right now the code makes too many calls to malloc for my taste, and it really slows everything down.

Solution

All the parsing of the request should be done on the one malloc'ed array that actually holds the request.
Each header then simply holds a pointer to a value inside that block, plus its length; this could really reduce the number of malloc calls.
This also requires the headers to be aware that they only sometimes need to free their content; this could be achieved with a simple boolean passed to them on creation.

Reduce shared memory

Right now a lot of memory is still allocated in a shared way, to allow different processes to access the data.
Since threads are now used instead of processes, it can be moved back to normally allocated memory, which is easier to work with and most likely also faster.

Add log files

Right now all logging goes to the console.

There should be a log file that everything is written to.
The messages logged there should be the same ones that are logged to the console.
There should also be a timestamp at the beginning of every log message.
