I have a DataTable with several columns from which I want to remove all duplicates, like this:
Dt1 = Dt1.AsEnumerable().GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") }).Select(g => g.First()).CopyToDataTable();
However, the code above leaves one entry of each set of duplicates (the first one found) in the DataTable, because of the Select(g => g.First()) at the end of the LINQ query.
Is there a way to remove all duplicates and keep none?
Edit: An example of what the code does now and what it should do.
Take a DataTable with entries like these:
| Name | Filesize | Filename |
|---|---|---|
| One | 50 | Fileone |
| Two | 50 | Fileone |
| Three | 50 | Filetwo |
| Four | 50 | Filethree |
The LINQ query above removes line 2 because its Filename and Filesize are the same as in line 1. However, line 1 stays, because the query keeps the first entry of each group of duplicates.
I want both line 1 and line 2 removed from the DataTable.
CodePudding user response:
Dt1 = Dt1.AsEnumerable()
.GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") })
.Where(g => g.Count() == 1)
.Select(g => g.First())
.CopyToDataTable();
That will discard any groups with more than one item, then get the first (and only) item from the rest.
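For reference, here is a minimal, self-contained sketch of the same idea run against the example table from the question. It assumes the column names and types from that table (Name, Filesize as int, Filename) rather than filename1/filesizeinkb, and RemoveAllDuplicates and BuildSample are names made up for illustration. It also guards against the case where every row is a duplicate, since CopyToDataTable throws an InvalidOperationException when the source contains no rows.

```csharp
using System;
using System.Data;
using System.Linq;

public static class Program
{
    // Hypothetical helper (not from the original post): keeps only rows whose
    // (Filename, Filesize) pair occurs exactly once in the table.
    public static DataTable RemoveAllDuplicates(DataTable table)
    {
        var uniqueRows = table.AsEnumerable()
            .GroupBy(r => new
            {
                Filename = r.Field<string>("Filename"),
                Filesize = r.Field<int>("Filesize")
            })
            .Where(g => g.Count() == 1)   // drop every group that has duplicates
            .Select(g => g.Single())      // each remaining group has exactly one row
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an
        // empty table with the same schema when nothing is left.
        return uniqueRows.Count > 0 ? uniqueRows.CopyToDataTable() : table.Clone();
    }

    // Builds the sample data from the question (column types are an assumption).
    public static DataTable BuildSample()
    {
        var dt = new DataTable();
        dt.Columns.Add("Name", typeof(string));
        dt.Columns.Add("Filesize", typeof(int));
        dt.Columns.Add("Filename", typeof(string));
        dt.Rows.Add("One", 50, "Fileone");
        dt.Rows.Add("Two", 50, "Fileone");
        dt.Rows.Add("Three", 50, "Filetwo");
        dt.Rows.Add("Four", 50, "Filethree");
        return dt;
    }

    public static void Main()
    {
        // Prints "Three" and "Four"; both "Fileone" rows are gone.
        foreach (DataRow row in RemoveAllDuplicates(BuildSample()).Rows)
        {
            Console.WriteLine(row["Name"]);
        }
    }
}
```

If the real columns are strings, as in the question's code, swap Field<int> back to Field<string> and adjust the column names.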
CodePudding user response:
Note: This was blindly typed here, so there might be some typos in the code.
The idea is to get the number of rows in your DataTable, go through each of them, and do what you already did.
int NumOfItems = Dt1.Rows.Count;
for (int i = 0; i < NumOfItems; i++)
{
// Same de-duplication as in the question: keeps the first row of each (filename, filesize) group.
Dt1 = Dt1.AsEnumerable().GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") }).Select(g => g.First()).CopyToDataTable();
}
