Giter Site home page Giter Site logo

Messy whole program structure about gpusph HOT 6 OPEN

gpusph avatar gpusph commented on May 30, 2024
Messy whole program structure

from gpusph.

Comments (6)

Narcolessico avatar Narcolessico commented on May 30, 2024

Hi Jose,

the main structure of GPUSPH comes from a small simulator that was not even written in C; then ported to exploit the GPU with CUDA, then multi-GPU and multi-node. All of this with from a handful of main developers and a lot of small, precious contributions. Many contributions came by passionate researchers who do not have a background in professional software development and were often overwhelmed with tasks (you know how research works - one can't just work full-time "only" on maintenance for months).

You're not the first one realizing that the overall structure has large improvement margins :-) but it is less trivial to realize that there are a lot of ideas, list of improvements, discussions and tests in the TODO list, not to mention side activities like modeling and interpreting experiments. What is lacking is not awareness or good will, but rather what in the software industry is called the "capacity" (aka: there are very few contributors with very few time available), while deadlines end up increasing the "technical debt". The approach currently used is the same used in many in the industry: rather than an abrupt rewrite, a long, incremental improvement process. You will find a lot of traces of it in the commit history.

Every feedback is welcome but it is especially constructive when it ends with "I propose". Could you make a detailed proposal, possibly in form of code by forking the repository and implementing its main lines? I am sure the merge request will get all the attention it deserves.

Thank you for your feedback!

from gpusph.

Oblomov avatar Oblomov commented on May 30, 2024

In addition to what @Narcolessico wrote, there are both historical and practical reason for the current design. And while we are steadily moving towards an improved internal architecture, this is still done within the constraints imposed by available time and workforce, which as mentioned isn't exactly abundant.

Concerning specifically the current integrated relationship between user problems and main application, this is due to a choice made in finding an optimal compromise between development time and code performance. More in detail, the device code massively relies on templates to fence optional components (both code and variables), and avoid the mere support for additional features from having any impact at all on performance. The downside of this approach is that to support runtime selection of options requires the compilation of all (compatible) option combinations, which results in extremely long compile times. In fact, the choice to make the framework options a compile-time decision in the user problem came when, even with considerably less variety than we have today, it took several hours to come out with the independent main executable, which completely destroyed the performance of the edit/compile/run cycle during development.

I can't say I'm happy with the decision, and as a matter of fact, we do have a solution planned, which consists in moving away from the current “single source” approach for the device code into a “separate source” approach, which is supported in recent versions of CUDA via NVRTC.

Following up to @Narcolessico proposal to “put your money where your mouth is”, you could for example try and implement dynamic loading of specific instances of the device code via NVRTC, as a first step; this would subsequently allow the CUDASimFramework to be instantiated from within the GPUWorker, rather than the user problem, which is a prerequisite to untangle the user problem from the rest of GPUSPH.

from gpusph.

sanguinariojoe avatar sanguinariojoe commented on May 30, 2024

Hey guys!

First of all, regarding netiquette... No offence, but impolite and not-diplomatic language is the generally accepted approach in free software development. It is, because polite discussions are not pragmatic, and don't let evaluate the level of concern. Now (since #31) you have a fully open project, so I would expect this short of comments quite often.

Moving back to the issue itself...

You're not the first one realizing that the overall structure has large improvement margins :-) but it is less trivial to realize that there are a lot of ideas, list of improvements, discussions and tests in the TODO list

Then publish the TODO list in the repo, and state in README how feature requests will be eventually addressed. Anyway, I would say that a bad structured code is a major issue... At least if you really want to get 3rd party developers contributing.

I can't say I'm happy with the decision, and as a matter of fact, we do have a solution planned, which consists in moving away from the current “single source” approach for the device code into a “separate source” approach, which is supported in recent versions of CUDA via NVRTC.

That's good! But AFAIK the required changes are not that great (for this specific problem). Of course, the "host side" can be easily adapted, as can be appreciated in main.cc:381 line (you just need to figure out how to reallocate the NetworkManager code). Regarding the "device side", using CMake (see #35) should not be hard to pack all the deps (revamping a bit the declarations) into a library. Further details here:

https://devblogs.nvidia.com/building-cuda-applications-cmake/

Let me know!

from gpusph.

Oblomov avatar Oblomov commented on May 30, 2024

No offence, but impolite and not-diplomatic language is the generally accepted approach in free software development. It is, because polite discussions are not pragmatic, and don't let evaluate the level of concern.

That is not really the case. Most of the discussion in and around free software development, even when not-diplomatic and straight-to-the-point is not impolite. If you are unable to communicate without coming off as arrogant and sanctimonious, then we can do without the communication, especially when you fail at being both polite and pragmatic —the only result of which is the time is spent in this kind of useless discussions.

you have a fully open project, so I would expect this short of comments quite often.

So far you've been the only one, and you can expect anyone coming out as badly to be treated the same way.

Then publish the TODO list in the repo, and state in README how feature requests will be eventually addressed.

These are things that will happen in time, as time permits and things evolve.

Anyway, I would say that a bad structured code is a major issue... At least if you really want to get 3rd party developers contributing.

You'll find that there are several areas where contributions are possible despite the structural limitations of the current implementation.

But AFAIK the required changes are not that great (for this specific problem). Of course, the "host side" can be easily adapted, as can be appreciated in main.cc:381 line (you just need to figure out how to reallocate the NetworkManager code). Regarding the "device side", using CMake (see #35) should not be hard to pack all the deps (revamping a bit the declarations) into a library.

You are looking at the situation from the wrong side up. A better build system is completely irrelevant with the current GPUSPH structure. The current GPUSPH structure (as in: the relationship between the GPUSPH class itself, main and the user problems) cannot change until the problem is decoupled from the framework. The problem cannot be decoupled from the framework until dynamic device code loading via NVRTC is implemented.

Here's the pragmatic approach you can follow if you actually wish to contribute something useful instead of vague criticism from the peanut gallery:

  1. implement dynamic framework code loading via NVRTC as an alternative functional alongside the current compile-time “single-source” framework;
  2. add support for the NVRTC framework implementation to the Problem and XProblem class; among other things, this should provide an API to the user problems that need to load their own device code for the custom kernels needed by some options (e.g. open boundaries, CALC_PRIVATE postprocessing, etc);
  3. move the existing test problems to the NVRTC framework implementation;
  4. deprecate the existing single-source framework implementation.

If you manage to implement these preliminary steps fast enough, they could be merged for 5.0, which would allow us to obsolete the single-source framework for the next major release, allowing us to move forward in the refactoring for the libification of GPUSPH.

from gpusph.

sanguinariojoe avatar sanguinariojoe commented on May 30, 2024

So far you've been the only one, and you can expect anyone coming out as badly to be treated the same way.

Jajaja, that's great ;-)
As I said, this is not a personal attack... Even though you decided to go that way with comments like "If you are unable to communicate without coming off as arrogant and sanctimonious", or "vague criticism from the peanut gallery". Indeed a great mix of Straw Man and Ad hominem fallacies.

Maybe we can stop the weapon climbing at this point.

These are things that will happen in time, as time permits and things evolve.

I consider that a major issue too... It is not I don't understand your point... I really do... But you cannot reply you have not time to work on something because a long todo list, without publishing such todo list... That would make people feel you just simply don't care... Recent similar situations here and here.

BTW, you can add labels/tags to the issues. You know, like "feature request", "bug", "enhancement" and so on... Honestly, I don't do that in my own repos, but apparently people like it

You are looking at the situation from the wrong side up. A better build system is completely irrelevant with the current GPUSPH structure. The current GPUSPH structure (as in: the relationship between the GPUSPH class itself, main and the user problems) cannot change until the problem is decoupled from the framework. The problem cannot be decoupled from the framework until dynamic device code loading via NVRTC is implemented.

I completely disagree! Wanna you get an extensible code by means of NVRTC? perfect! I like it! In fact AQUAgpusph works like that. But I really can't see what has it to do with getting a main + libs structure... I made AQUAgpusph use that paradigm much before going modular/extensible.

For instance I consider src/problems/DamBreak3D.cu (probably not the most complex example, I know):

	const bool WET = get_option("wet", false);
	const bool USE_PLANES = get_option("use_planes", false);
	const uint NUM_OBSTACLES = get_option("num_obstacles", 1);
	const bool ROTATE_OBSTACLE = get_option("rotate_obstacle", true);
	const uint NUM_TESTPOINTS = get_option("num_testpoints", 3);
	// density diffusion terms: 0 none, 1 Ferrari, 2 Molteni & Colagrossi, 3 Brezzi
	const int RHODIFF = get_option("density-diffusion", 2);
(...)
	// Allow user to set the MLS frequency at runtime. Default to 0 if density
	// diffusion is enabled, 10 otherwise
	const int mlsIters = get_option("mls",
		(simparams()->densitydiffusiontype != DENSITY_DIFFUSION_NONE) ? 0 : 10);

Directly came from main...

	// ** framework setup
	// viscosities: KINEMATICVISC*, DYNAMICVISC*
	// turbulence models: ARTVISC*, SPSVISC, KEPSVISC
	// boundary types: LJ_BOUNDARY*, MK_BOUNDARY, SA_BOUNDARY, DYN_BOUNDARY*
	// * = tested in this problem
	SETUP_FRAMEWORK(
		viscosity<ARTVISC>,
		boundary<DYN_BOUNDARY>
	).select_options(
		RHODIFF == FERRARI, densitydiffusion<FERRARI>(),
		RHODIFF == BREZZI, densitydiffusion<BREZZI>(),
		RHODIFF == COLAGROSSI, densitydiffusion<COLAGROSSI>(),
		USE_PLANES, add_flags<ENABLE_PLANES>()
	);

	// will dump testpoints separately
	addPostProcess(TESTPOINTS);
(...)
	if (mlsIters > 0)
		addFilter(MLS_FILTER, mlsIters);

	// Explicitly set number of layers. Also, prevent having undefined number of layers before the constructor ends.
	setDynamicBoundariesLayers(3);

	resize_neiblist(128);

	// *** Initialization of minimal physical parameters
	set_deltap(0.015f);
	//simparams()->dt = 0.00005;
	physparams()->r0 = m_deltap;
	physparams()->gravity = make_float3(0.0, 0.0, -9.81);
	const float g = length(physparams()->gravity);
	const double H = 0.4;
	physparams()->dcoeff = 5.0f * g * H;
	add_fluid(1000.0);

	//add_fluid(2350.0);
	set_equation_of_state(0,  7.0f, 20.0f);
	set_kinematic_visc(0, 1.0e-2f);
	//set_dynamic_visc(0, 1.0e-4f);

	// default tend 1.5s
	simparams()->tend=1.5f;
	//simparams()->ferrariLengthScale = H;
	simparams()->densityDiffCoeff = 0.1f;
	/*physparams()->artvisccoeff =  0.2;
	physparams()->smagfactor = 0.12*0.12*m_deltap*m_deltap;
	physparams()->kspsfactor = (2.0/3.0)*0.0066*m_deltap*m_deltap;
	physparams()->epsartvisc = 0.01*simparams()->slength*simparams()->slength;*/

	// Drawing and saving times
	add_writer(VTKWRITER, 0.005f);

Can't this be librified?

	// *** Other parameters and settings
	m_name = "DamBreak3D";

	// *** Geometrical parameters, starting from the size of the domain
	const double dimX = 1.6;
	const double dimY = 0.67;
	const double dimZ = 0.6;
	const double obstacle_side = 0.12;
	const double obstacle_xpos = 0.9;
	const double water_length = 0.4;
	const double water_height = H;
	const double water_bed_height = 0.1;

	// If we used only makeUniverseBox(), origin and size would be computed automatically
	m_origin = make_double3(0, 0, 0);
	m_size = make_double3(dimX, dimY, dimZ);

Just attributes

	// set positioning policy to PP_CORNER: given point will be the corner of the geometry
	setPositioning(PP_CORNER);

	// main container
	if (USE_PLANES) {
		// limit domain with 6 planes
		makeUniverseBox(m_origin, m_origin + m_size);
	} else {
		GeometryID box =
			addBox(GT_FIXED_BOUNDARY, FT_BORDER, m_origin, dimX, dimY, dimZ);
		// we simulate inside the box, so do not erase anything
		setEraseOperation(box, ET_ERASE_NOTHING);
	}

	// Planes unfill automatically but the box won't, to void deleting all the water. Thus,
	// we define the water at already the right distance from the walls.
	double BOUNDARY_DISTANCE = m_deltap;
	if (simparams()->boundarytype == DYN_BOUNDARY && !USE_PLANES)
			BOUNDARY_DISTANCE *= getDynamicBoundariesLayers();

	// Add the main water part
	addBox(GT_FLUID, FT_SOLID, Point(BOUNDARY_DISTANCE, BOUNDARY_DISTANCE, BOUNDARY_DISTANCE),
		water_length - BOUNDARY_DISTANCE, dimY - 2 * BOUNDARY_DISTANCE, water_height - BOUNDARY_DISTANCE);
	// Add the water bed if wet. After we'll implement the unfill with custom dx, it will be possible to declare
	// the water bed overlapping with the main part.
	if (WET) {
		addBox(GT_FLUID, FT_SOLID,
			Point(water_length + m_deltap, BOUNDARY_DISTANCE, BOUNDARY_DISTANCE),
			dimX - water_length - BOUNDARY_DISTANCE - m_deltap,
			dimY - 2 * BOUNDARY_DISTANCE,
			water_bed_height - BOUNDARY_DISTANCE);
	}

	// set positioning policy to PP_BOTTOM_CENTER: given point will be the center of the base
	setPositioning(PP_BOTTOM_CENTER);

	// add one or more obstacles
	const double Y_DISTANCE = dimY / (NUM_OBSTACLES + 1);
	// rotation angle
	const double Z_ANGLE = M_PI / 4;



        // activate the solid obstacle
	for (uint i = 0; i < NUM_OBSTACLES; i++) {
		// Obstacle is of type GT_MOVING_BODY, although the callback is not even implemented, to
		// make the forces feedback available
		GeometryID obstacle = addBox(GT_MOVING_BODY, FT_BORDER,
			Point(obstacle_xpos, Y_DISTANCE * (i+1) + (ROTATE_OBSTACLE ? obstacle_side/2 : 0), 0),
				obstacle_side, obstacle_side, dimZ );
		if (ROTATE_OBSTACLE) {
			rotate(obstacle, 0, 0, Z_ANGLE);
			// until we'll fix it, the rotation centers are always the corners
			// shift(obstacle, 0, obstacle_side/2, 0);
		}
		// enable force feedback to measure forces
		enableFeedback(obstacle);
	}

	// Optionally, add a floating objects
	/*
	// set positioning policy to PP_CENTER: given point will be the geometrical center of the object
	setPositioning(PP_CENTER);
	GeometryID floating_obj =
		addSphere(GT_FLOATING_BODY, FT_BORDER, Point(water_length, dimY/2, water_height), obstacle_side);
	// half water density to make it float
	setMassByDensity(floating_obj, physparams()->rho0[0] / 2);
	setParticleMassByDensity(floating_obj, physparams()->rho0[0] / 2);
	// disable collisions: will only interact with fluid
	// disableCollisions(floating_obj);
	*/

	// add testpoints
	const float TESTPOINT_DISTANCE = dimZ / (NUM_TESTPOINTS + 1);
	for (uint t = 0; t < NUM_TESTPOINTS; t++)
		addTestPoint(Point(0.25*dimX, dimY/2.0, (t+1) * TESTPOINT_DISTANCE/2.0));

	for (uint t = 0; t < NUM_TESTPOINTS; t++)
		addTestPoint(Point(0.4*dimX, dimY/2.0, (t+1) * TESTPOINT_DISTANCE/2.0));

	for (uint t = 0; t < NUM_TESTPOINTS; t++)
		addTestPoint(Point(0.75*dimX, dimY/2.0, (t+1) * TESTPOINT_DISTANCE/2.0));

	for (uint t = 0; t < NUM_TESTPOINTS; t++)
		addTestPoint(Point(0.9*dimX, dimY/2.0, (t+1) * TESTPOINT_DISTANCE/2.0));

Is this not a geometry library?

I'm not as used as you to CUDA way of life, but are you sure you are not trying to make several changes in the same shot?

from gpusph.

Narcolessico avatar Narcolessico commented on May 30, 2024

Then publish the TODO list in the repo

Just a small remark on this point: there is no centralized TODO list AFAIK. There are TODOs in the code, some just orally discussed, some in shared documents. I would be useful to have them all together and prioritized, but at the time being there no single file to just push to the repo.

from gpusph.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.