
Comments (14)

martonb commented on June 5, 2024

I finally figured it out... The server table I was syncing from didn't actually have IDENTITY on the ID field (!). My SQL colleagues explained that on production the tables are split up and the ID is generated via a separate table (but my local data model, where I was syncing to, still had IDENTITY).
Because of this, millions of exceptions were thrown (one per row, 7,609,471 rows in the source table), filling up the error rows list
var errorsRows = new List<(SyncRow SyncRow, Exception Exception)>()
which massively slowed down processing of the UPSERTS files and made memory grow excessively.
I recreated my client table, and now it all seems to work fine...

If I may suggest, for the production version: as soon as the error count hits a preset limit N, just abort the whole sync, so the app doesn't crash with an OutOfMemoryException?
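
Something like this, as a rough sketch, using the error interceptor that recent Dotmim.Sync releases expose (the interceptor and enum member names here are assumptions and may differ by version; the threshold value is made up):

// Given a configured SyncAgent 'agent'. Assumption: recent Dotmim.Sync
// versions expose OnApplyChangesErrorOccured and an ErrorResolution enum;
// exact names may differ by version.
var maxErrors = 1000;   // hypothetical threshold "N"
var errorCount = 0;

agent.LocalOrchestrator.OnApplyChangesErrorOccured(args =>
{
    errorCount++;
    // Below the threshold, keep going and collect the failed row; past it,
    // abort the sync instead of letting the error list grow until OutOfMemory.
    args.Resolution = errorCount < maxErrors
        ? ErrorResolution.ContinueOnError
        : ErrorResolution.Throw;
});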

Thank you for your support!


martonb commented on June 5, 2024

Also, about the slowness you noticed: I changed the change tracking retention period to 1 MINUTE in my client (destination) database (we only sync from Server to Client), and processing of the UPSERTS files became faster. So it looks like MS SQL Server's internal change tracking is slowing the sync down.
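
For anyone wanting to replicate that, the retention window is a plain T-SQL database setting; here is a rough sketch of applying it from the client app (the [Client] database name and connection string are placeholders, and change tracking must already be enabled on the database for the ALTER to succeed):

using Microsoft.Data.SqlClient;

// Sketch: shrink the change-tracking retention window on the client database.
// [Client] and the connection string are placeholders.
await using var connection = new SqlConnection(
    "Data Source=.;Initial Catalog=Client;Integrated Security=true");
await connection.OpenAsync();

await using var command = connection.CreateCommand();
command.CommandText = @"
    ALTER DATABASE [Client]
    SET CHANGE_TRACKING (CHANGE_RETENTION = 1 MINUTES, AUTO_CLEANUP = ON);";
await command.ExecuteNonQueryAsync();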


Mimetis commented on June 5, 2024

I'm not able to reproduce it.
Can you provide a small sample I can use to reproduce this memory leak?


martonb commented on June 5, 2024

The table that causes this currently has 4,643,451 rows, totaling 1,626.703 MB (~1.6 GB), so a small sample it is not ;)
But you could probably generate a similar table (half the size would do, I think) with dummy data in it.
I let the fixed version run overnight and it still crashed for some reason; running it now from the debugger to get a call stack.

Using the following settings (a code sketch follows below):

  • MS SQL Server 2008
  • SqlSyncChangeTrackingProvider for client and server too
  • SyncDirection.DownloadOnly
  • BatchSize = 6 * 1024

I could provide the exact test application if it helps, but not the data from the tables, as it contains clients' GPS coordinates and such.
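
In code, those settings amount to roughly the following sketch (connection strings and the table name are placeholders, and the usual Dotmim.Sync usings are assumed):

// Placeholders: point these at the real server and client databases.
var serverConnectionString = "Data Source=.;Initial Catalog=Server;Integrated Security=true";
var clientConnectionString = "Data Source=.;Initial Catalog=Client;Integrated Security=true";

var serverProvider = new SqlSyncChangeTrackingProvider(serverConnectionString);
var clientProvider = new SqlSyncChangeTrackingProvider(clientConnectionString);

// Download-only: changes flow from server to client only.
var setup = new SyncSetup("BigTable");
setup.Tables["BigTable"].SyncDirection = SyncDirection.DownloadOnly;

var options = new SyncOptions { BatchSize = 6 * 1024 };

var agent = new SyncAgent(clientProvider, serverProvider, options);
var result = await agent.SynchronizeAsync(setup);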


Mimetis commented on June 5, 2024

I have two tables with 2 million rows in total.

Here is the code I used to create the two heavy tables, 1M rows each:

USE [master]
GO 

-- Server database
if (exists (select * from sys.databases where name = 'AdventureWorks'))
Begin
	ALTER DATABASE [AdventureWorks] SET  SINGLE_USER WITH ROLLBACK IMMEDIATE;
	DROP DATABASE [AdventureWorks]
End
Create database [AdventureWorks]
Go

-- Client database. No need to create the schema, Dotmim.Sync will do
if (exists (select * from sys.databases where name = 'Client'))
Begin
	ALTER DATABASE [Client] SET  SINGLE_USER WITH ROLLBACK IMMEDIATE;
	DROP DATABASE [Client]
End
Create database [Client]
Go
 
USE [AdventureWorks]
GO

CREATE TABLE [dbo].[ProductCategory](
	[ProductCategoryID] [uniqueidentifier] NOT NULL,
	[ParentProductCategoryID] [uniqueidentifier] NULL,
	[Name] [nvarchar](50) NOT NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL  NULL,
	[ModifiedDate] [datetime] NULL,
	[IsActive] [bit] NOT NULL DEFAULT(1),
 CONSTRAINT [PK_ProductCategory_ProductCategoryID] PRIMARY KEY CLUSTERED ([ProductCategoryID] ASC)) ON [PRIMARY]

/****** Object:  Table [dbo].[ProductDescription]    Script Date: 07/10/2017 15:28:49 ******/
CREATE TABLE [dbo].[ProductDescription](
	[ProductDescriptionID] [int] IDENTITY(1,1) NOT NULL,
	[Description] [nvarchar](400) NOT NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
	[ModifiedDate] [datetime] NOT NULL,
 CONSTRAINT [PK_ProductDescription_ProductDescriptionID] PRIMARY KEY CLUSTERED ([ProductDescriptionID] ASC)) ON [PRIMARY]
GO


INSERT [dbo].[ProductCategory] ([ProductCategoryID], [ParentProductCategoryID], [Name], [rowguid], [ModifiedDate], [IsActive]) 
VALUES ('cfbda25c-df71-47a7-b81b-64ee161aa37c', NULL, N'Bikes', newid(), CAST(N'2002-06-01T00:00:00.000' AS DateTime), 1)

INSERT [dbo].[ProductCategory] ([ProductCategoryID], [ParentProductCategoryID], [Name], [rowguid], [ModifiedDate], [IsActive]) 
		VALUES (NEWID(), 'cfbda25c-df71-47a7-b81b-64ee161aa37c', convert(varchar(100), N'Bike-' + Lower(convert(varchar(36), newid()))),newid(), CAST(N'2023-06-01T00:00:00.000' AS DateTime), 1)
go 1000000

go
INSERT [dbo].[ProductDescription] ([Description], [rowguid], [ModifiedDate]) 
VALUES (convert(varchar(100), N'Bike description-' + Lower(SUBSTRING(convert(varchar(36), newid()), 1, 4))),newid(), CAST(N'2023-06-01T00:00:00.000' AS DateTime))
go 1000000

ALTER TABLE [dbo].[ProductDescription] ADD  CONSTRAINT [AK_ProductDescription_rowguid] UNIQUE NONCLUSTERED ([rowguid] ASC)
ALTER TABLE [dbo].[ProductCategory] ADD  CONSTRAINT [AK_ProductCategory_Name] UNIQUE NONCLUSTERED ([Name] ASC)
ALTER TABLE [dbo].[ProductCategory] ADD  CONSTRAINT [AK_ProductCategory_rowguid] UNIQUE NONCLUSTERED ([rowguid] ASC)
ALTER TABLE [dbo].[ProductCategory]  WITH CHECK 
ADD CONSTRAINT [FK_ProductCategory_ProductCategory_ParentProductCategoryID_ProductCategoryID] 
	FOREIGN KEY([ParentProductCategoryID]) 
	REFERENCES [dbo].[ProductCategory] ([ProductCategoryID])
ALTER TABLE [dbo].[ProductCategory] CHECK CONSTRAINT [FK_ProductCategory_ProductCategory_ParentProductCategoryID_ProductCategoryID]

Then the sync is pretty simple:

var clientProvider = new SqlSyncProvider(DBHelper.GetDatabaseConnectionString(clientDbName));
var serverProvider = new SqlSyncProvider(DBHelper.GetDatabaseConnectionString(serverDbName));
var setup = new SyncSetup("ProductCategory", "ProductDescription");
// avoiding reference check constraint failing on product category table fk 
var options = new SyncOptions { DisableConstraintsOnApplyChanges = true };

var progress = new SynchronousProgress<ProgressArgs>(s =>
    Console.WriteLine($"{s.ProgressPercentage:p}:  " +
    $"\t[{s?.Source?[..Math.Min(4, s.Source.Length)]}] {s.TypeName}: {s.Message}"));

var agent = new SyncAgent(clientProvider, serverProvider, options);

// output batch applied
agent.LocalOrchestrator.OnBatchChangesApplied(Console.WriteLine);

try
{
    Console.ForegroundColor = ConsoleColor.Green;
    var s = await agent.SynchronizeAsync(setup, progress);
    Console.WriteLine(s);
}
catch (Exception e)
{
    Console.ResetColor();
    Console.WriteLine(e.Message);
}

Using a performance analysis, I'm under 97 MB of memory used, without any leaks.

(screenshot: memory profile from the performance analysis)

Note: I did not use SqlSyncChangeTrackingProvider. For some reason it's slower... I'm investigating why (and honestly I don't see why).

Can you test this code and confirm you get the same behavior?


martonb commented on June 5, 2024

I used the SqlSyncChangeTrackingProvider because it doesn't need the extra _tracking tables, and change tracking supposedly has lower overhead on the SQL Server.
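
For reference, swapping providers in the sample above only changes the construction lines; my understanding is that the change-tracking provider additionally requires change tracking to be enabled on the database beforehand:

// Same sample as above, with the change-tracking provider on both sides.
// Prerequisite (assumption): change tracking is already enabled on both
// databases, e.g.
//   ALTER DATABASE [AdventureWorks] SET CHANGE_TRACKING = ON
//     (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
var clientProvider = new SqlSyncChangeTrackingProvider(DBHelper.GetDatabaseConnectionString(clientDbName));
var serverProvider = new SqlSyncChangeTrackingProvider(DBHelper.GetDatabaseConnectionString(serverDbName));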
Will try your way tomorrow morning and let you know. Thanks!


martonb commented on June 5, 2024

Running with SqlSyncProvider causes errors in the application writing to the source database:
SqlException: Transaction (Process ID 97) was deadlocked on lock resources with another process and has been chosen as the deadlock victim.
As our use case is to keep the two databases in sync while the source is being changed by another app, I think this option will not work for us...

Memory usage was indeed stable, though!

But my original fix (runner.DisposeAsync()) is still needed, IMO, as that class does implement IDisposable, but Dispose is never called on it.
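
For context, here is the pattern behind that fix, sketched with an illustrative stand-in (Dotmim.Sync's actual connection runner is internal, so the shape below is an assumption, not the library's API):

using System;
using System.Data.Common;
using System.Threading.Tasks;

// Illustrative stand-in for the internal connection runner: CommitAsync
// commits the work but does not dispose the connection, so the runner
// itself must be disposed deterministically.
public sealed class ConnectionRunner : IAsyncDisposable
{
    private readonly DbConnection connection;
    private readonly DbTransaction transaction;

    public ConnectionRunner(DbConnection connection, DbTransaction transaction)
        => (this.connection, this.transaction) = (connection, transaction);

    public Task CommitAsync() => transaction.CommitAsync();

    public async ValueTask DisposeAsync()
    {
        // Without this, pooled connections and transactions linger until
        // finalization, which shows up as the memory growth seen here.
        await transaction.DisposeAsync();
        await connection.DisposeAsync();
    }
}

// Usage: "await using" runs DisposeAsync even if applying changes throws.
// await using var runner = new ConnectionRunner(connection, transaction);
// ...apply batch...
// await runner.CommitAsync();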


Mimetis commented on June 5, 2024

SqlException: Transaction (Process ID 97) was deadlocked

Can you provide a small sample that shows the deadlock? (ideally starting from my sample)

But my original fix (runner.DisposeAsync()) is still needed

I need to check for runner.DisposeAsync(), because runner.CommitAsync() basically does almost the same thing (it does not dispose the connection itself).

Memory usage was indeed stable, though!

Does it mean we have a memory leak using SqlSyncChangeTrackingProvider but not SqlSyncProvider?


martonb commented on June 5, 2024

Does it mean we have a memory leak using SqlSyncChangeTrackingProvider but not SqlSyncProvider?

It definitely looks like it: with SqlSyncProvider memory usage is stable, but with SqlSyncChangeTrackingProvider I run out of memory...


Mimetis commented on June 5, 2024

OK, I still need a sample to reproduce it, because it's not happening on my side.


martonb commented on June 5, 2024

So, I tried the database you provided, then tried again on mine: I cannot reproduce the OutOfMemory exception any more.
I went over my code multiple times; the only thing I can think of is that I had DisableConstraintsOnApplyChanges set to false, and it generated so many exceptions that it crashed in the end? Or I had mixed provisioning? Not sure...
If I manage to get back to the same situation, I'll zip up my sources and attach them here.
Thank you for your assistance!


martonb commented on June 5, 2024

I just reproduced it again, on one of our tables. Super strange...
At the beginning memory was stable, then at the later stages I noticed my mouse getting sluggish, opened Task Manager, and voilà :O

(screenshots: console output and RAM usage in Task Manager)


Mimetis commented on June 5, 2024

That's a very good idea!
I think defining a threshold limit for errors can help preserve your memory a little bit :)
I will think about it.


VagueGit commented on June 5, 2024

I upvoted the comment by @martonb

If I may suggest, for the production version: as soon as the error count hits a preset limit N, just abort the whole sync, so the app doesn't crash with an OutOfMemoryException?

We recently had an issue where a sync merge statement for a single user threw millions of errors, causing the server CPU to max out. So to protect the server for all users, it would be a good idea to have sync abort after a preset number of errors.
