How to spawn threads to make multiple posts at the same time

Time:01-27

I have a .NET 6 BackgroundService which pushes data from an on-premises system to a 3rd party API.

The 3rd party API takes about 500 milliseconds to process the API call.

The problem is that I have about 1,000,000 rows of data to push to this API one at a time. At 1/2 second per row, it's going to take about 6 days to sync up.

So, I would like to try to spawn multiple threads in order to hit the API simultaneously with 10 threads.

var startTime = DateTimeOffset.Now;
var batchSize = _config.GetValue<int>("BatchSize");
using (var scope = _serviceScopeFactory.CreateScope())
{
    var context = scope.ServiceProvider.GetRequiredService<PlankContext>();

    var dncEntries = await context.PlankQueueDnc.Where(x => x.ToProcessFlag == true).Take(batchSize).ToListAsync();

    foreach (var plankQueueDnc in dncEntries)
    {
        var response = await _plankConnector.InsertDncAsync(plankQueueDnc);
        context.PlankQueueDnc.Update(plankQueueDnc);
    }
    await context.SaveChangesAsync();
}

Here is the code. As you can see, it gets a batch of 100 records and then processes them one by one. Is there a way to modify this so this line is not awaited? I don't quite understand how it would work if it were not awaited. Would it create a thread for each execution in the loop?

var response = await _plankConnector.InsertDncAsync(plankQueueDnc);

I am clearly not as up to speed on threads as the esteemed @StephanCleary.

So suggestions would be appreciated.

CodePudding user response:

In .NET 6 you can use Parallel.ForEachAsync to execute operations concurrently, using either the default limit (one operation per available core) or an explicit degree of parallelism set through ParallelOptions.MaxDegreeOfParallelism.

The following code loads a batch of records, executes the posts concurrently, then updates the records:

using (var scope = _serviceScopeFactory.CreateScope())
{
    var context = scope.ServiceProvider.GetRequiredService<PlankContext>();

    var dncEntries = await context.PlankQueueDnc
                                  .Where(x => x.ToProcessFlag == true)
                                  .Take(batchSize)
                                  .ToListAsync();
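    // The body delegate receives the item and a CancellationToken.
    await Parallel.ForEachAsync(dncEntries, async (plankQueueDnc, cancellationToken) =>
    {
        var response = await _plankConnector.InsertDncAsync(plankQueueDnc);
        // Placeholder: copy whatever the API returned onto the tracked entity.
        plankQueueDnc.Whatever = response.Something;
    });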

    await context.SaveChangesAsync();
}

There's no reason to call Update, as a DbContext tracks the objects it loaded and knows which ones were modified. SaveChangesAsync will persist all changes in a single transaction.
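
To cap the concurrency at the 10 simultaneous calls the question asks for, a ParallelOptions instance with MaxDegreeOfParallelism can be passed to Parallel.ForEachAsync. The following is only a minimal, self-contained sketch: FakeInsertAsync is a hypothetical stand-in for _plankConnector.InsertDncAsync, the rows are plain integers instead of PlankQueueDnc entities, and the delay is shortened from the roughly 500 ms in the question so that it finishes quickly. The in-flight counter is there only to show that the cap is respected.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the real API call; it only waits and returns a string.
static async Task<string> FakeInsertAsync(int row, CancellationToken ct)
{
    await Task.Delay(50, ct);
    return $"ok-{row}";
}

var rows = Enumerable.Range(1, 50).ToList();   // pretend batch of 50 rows
var results = new string[rows.Count];          // each iteration writes its own slot
var gate = new object();
int inFlight = 0, maxInFlight = 0;

// Assumption: 10 is the limit wanted in the question; tune it to what the API tolerates.
var options = new ParallelOptions { MaxDegreeOfParallelism = 10 };

await Parallel.ForEachAsync(rows, options, async (row, ct) =>
{
    // Track how many calls are running at once (for demonstration only).
    lock (gate)
    {
        inFlight++;
        maxInFlight = Math.Max(maxInFlight, inFlight);
    }
    try
    {
        results[row - 1] = await FakeInsertAsync(row, ct);
    }
    finally
    {
        lock (gate) { inFlight--; }
    }
});

Console.WriteLine($"Processed {results.Length} rows, max concurrent calls: {maxInFlight}");
```

In the real service the loop body would only call the API and set properties on the loaded entities. The DbContext itself is not thread-safe, so it should stay outside the parallel body, with SaveChangesAsync called once after the loop completes.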
