Let's say a dataset looks like this:
Sno country noOfDeaths
1 India 3245325
2 America 234523
3 UK 3432523
3 UK 3432523
Here Sno 3 is duplicated, and I want to remove the entire duplicate row:
3 UK 3432523
This last line should be removed.
Here is the code I use to read the dataset:
data_reader.java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class data_reader {
    String filePath = "src\\covid_19_data.csv";
    BufferedReader reader = null;
    String line = "";

    public void readDataSet() {
        try {
            reader = new BufferedReader(new FileReader(filePath));
            while ((line = reader.readLine()) != null) {
                String[] row = line.split(",");
                // Print each cell left-aligned in a 10-character column.
                for (String index : row) {
                    System.out.printf("%-10s", index);
                }
                System.out.println();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // reader stays null if the file could not be opened.
                if (reader != null) {
                    reader.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
Main.java
public class Main {
    public static void main(String[] args) {
        data_reader obj = new data_reader();
        obj.readDataSet();
    }
}
Please help me with how to do this.
Update:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class data_reader {
    String filePath = "src\\abc.csv";
    BufferedReader reader = null;
    String line = "";
    String duplicateLine = "";
    Set<String> idSet = new HashSet<String>();

    public void readDataSet() {
        try {
            reader = new BufferedReader(new FileReader(filePath));
            while ((line = reader.readLine()) != null) {
                String[] row = line.split(",");
                // Only the ID column is stored, so the rest of the row is lost.
                idSet.add(row[0]);
                // for(String index:row) {
                //     System.out.printf("%-10s",index);
                // }
                System.out.println();
            }
            System.out.print(idSet);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // reader stays null if the file could not be opened.
                if (reader != null) {
                    reader.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
Output:
[1, 2, 3, Sno]
The set drops the duplicated ID, but how do I print the full rows like this?
Sno country noOfDeaths
1 India 3245325
2 America 234523
3 UK 3432523
CodePudding user response:
Record
Define a record to hold your data.
record Sample ( int sno, String country, long noOfDeaths ) {}
Because Sample is a record, the compiler implicitly creates overrides of the equals and hashCode methods. Those implementations consider each and every member field, so two Sample objects with the same values in all three fields are equal.
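As a small illustration of that point, the sketch below builds two Sample objects from the duplicated UK row and compares them. The class name RecordEqualityDemo is only an illustrative name, and records require Java 16 or later.

```java
// Same record as above; equals and hashCode are generated from all three fields.
record Sample(int sno, String country, long noOfDeaths) {}

class RecordEqualityDemo {
    public static void main(String[] args) {
        Sample first = new Sample(3, "UK", 3432523L);
        Sample second = new Sample(3, "UK", 3432523L);

        // Two distinct objects that are equal by value.
        System.out.println(first.equals(second));                  // true
        System.out.println(first.hashCode() == second.hashCode()); // true
    }
}
```

This value-based equality is what allows a Set to recognise the second UK row as a duplicate.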
Set
Instantiate an object per row of input data. Collect the objects into a Set. Sets disallow duplicates, so any duplicate being added is ignored.
Set< Sample > samples = new HashSet<>();
…
samples.add( new Sample( … ) ) ;
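To connect this to the CSV in the question, here is a minimal sketch of one way to fill in the elided parts. It assumes comma-separated lines with a header row first and no quoted fields. The names SampleDedup and parseAll are hypothetical, and it swaps in a LinkedHashSet rather than a HashSet so the rows keep their original order when printed.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

record Sample(int sno, String country, long noOfDeaths) {}

class SampleDedup {
    // Hypothetical helper: turns CSV lines (header first) into distinct Sample objects.
    static Set<Sample> parseAll(List<String> lines) {
        // LinkedHashSet keeps first-seen order; a plain HashSet would also drop duplicates.
        Set<Sample> samples = new LinkedHashSet<>();
        for (int i = 1; i < lines.size(); i++) { // start at 1 to skip the header row
            String[] row = lines.get(i).split(",");
            samples.add(new Sample(
                    Integer.parseInt(row[0].trim()),
                    row[1].trim(),
                    Long.parseLong(row[2].trim())));
        }
        return samples;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Sno,country,noOfDeaths",
                "1,India,3245325",
                "2,America,234523",
                "3,UK,3432523",
                "3,UK,3432523");
        System.out.printf("%-10s%-10s%-10s%n", "Sno", "country", "noOfDeaths");
        for (Sample s : parseAll(lines)) {
            System.out.printf("%-10d%-10s%-10d%n", s.sno(), s.country(), s.noOfDeaths());
        }
    }
}
```

In the question's reader, the same loop body could run on each line returned by reader.readLine() instead of on an in-memory list.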
NavigableSet
You may want to sort your distinct objects. Use a NavigableSet such as TreeSet, passing a Comparator object to specify the desired ordering.
The getter methods in a record share the same name as their respective member field. They do not follow the JavaBeans naming convention of a get/is prefix, so the getter for noOfDeaths is noOfDeaths(), not getNoOfDeaths().
NavigableSet < Sample > samples =
new TreeSet<>(
Comparator.comparingLong( Sample :: noOfDeaths )
);
…
samples.add( new Sample( … ) ) ;
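One thing to watch with this approach: a TreeSet decides whether two elements are duplicates by using its Comparator, not equals. With a comparator on noOfDeaths alone, two different countries that happen to share the same count would be treated as the same element. The sketch below is one possible way around that, adding tie-breakers on the remaining fields. The names SortedSamples and sortedDistinct are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

record Sample(int sno, String country, long noOfDeaths) {}

class SortedSamples {
    // Hypothetical helper: distinct samples ordered by deaths, ties broken by the other fields.
    static NavigableSet<Sample> sortedDistinct(List<Sample> input) {
        Comparator<Sample> order = Comparator
                .comparingLong(Sample::noOfDeaths)
                .thenComparingInt(Sample::sno)
                .thenComparing(Sample::country);
        NavigableSet<Sample> samples = new TreeSet<>(order);
        samples.addAll(input); // elements that compare as 0 to an existing one are ignored
        return samples;
    }

    public static void main(String[] args) {
        List<Sample> input = List.of(
                new Sample(1, "India", 3245325L),
                new Sample(2, "America", 234523L),
                new Sample(3, "UK", 3432523L),
                new Sample(3, "UK", 3432523L));
        for (Sample s : sortedDistinct(input)) {
            System.out.printf("%-10d%-10s%-10d%n", s.sno(), s.country(), s.noOfDeaths());
        }
    }
}
```

Because the comparator covers every field, it only returns 0 for rows that are also equal according to the record's equals method.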
CodePudding user response:
Create a Set<Integer> idSet and put the IDs into it.
If the idSet.add() method returns false, the ID has been seen before, so skip the current line. The Javadoc for Set.add says:
Adds the specified element to this set if it is not already present (optional operation). More formally, adds the specified element e to this set if the set contains no element e2 such that (e==null ? e2==null : e.equals(e2)). If this set already contains the element, the call leaves the set unchanged and returns false. In combination with the restriction on constructors, this ensures that sets never contain duplicate elements.
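A minimal sketch of that idea, applied to the rows from the question, could look like this. It keeps the IDs as strings (a Set<String>, as in the question's update) so that the header value Sno passes through without needing to be parsed as an integer. The names UniqueRows and keepFirstPerId are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UniqueRows {
    // Hypothetical helper: keeps only the first line seen for each ID in column 0.
    static List<String> keepFirstPerId(List<String> lines) {
        Set<String> idSet = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String[] row = line.split(",");
            // add() returns false when the ID is already in the set, so that line is skipped.
            if (idSet.add(row[0].trim())) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Sno,country,noOfDeaths",
                "1,India,3245325",
                "2,America,234523",
                "3,UK,3432523",
                "3,UK,3432523");
        for (String line : keepFirstPerId(lines)) {
            for (String cell : line.split(",")) {
                System.out.printf("%-10s", cell);
            }
            System.out.println();
        }
    }
}
```

Note that this compares the ID column only: if two rows share an Sno but differ in the other columns, the later one is still dropped.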
