I am trying to read this CSV file but I keep getting an index out of range error. Any ideas on how to avoid it?

Time:01-23

I am trying to read this CSV file so that I can later pick out the distinct elements in it and build a truth table: if an element exists in a row I put 1, if not 0. The row length isn't constant, and that's what I am having trouble with.

chicken,other vegetables,packaged fruit/vegetables,condensed milk,frozen vegetables,fruit/vegetable juice
vinegar,oil
rolls/buns,soda,specialty bar
whole milk
pork,root vegetables,whole milk,whipped/sour cream
rolls/buns,newspapers
grapes,other vegetables,zwieback,Instant food products,dishes
frankfurter,citrus fruit,whipped/sour cream,cream cheese ,rolls/buns,baking powder,seasonal products,napkins
finished products,root vegetables,packaged fruit/vegetables,yogurt,specialty cheese,frozen vegetables,ice cream,soda,bottled beer
onions,other vegetables,flour
tropical fruit,whole milk,rolls/buns

I am using the following code:

static void Main(string[] args)
{
    string filepath = @"C:\Downloads\groceriess.csv";
    DataTable res = ConvertCSVtoDataTable(filepath);

    DataTable ConvertCSVtoDataTable(string strFilePath)
    {
        StreamReader sr = new StreamReader(strFilePath);
        string[] headers = sr.ReadLine().Split(',');
        DataTable dt = new DataTable();
        foreach (string header in headers)
        {
            dt.Columns.Add(header);
        }
        while (!sr.EndOfStream)
        {
            string[] rows = Regex.Split(sr.ReadLine(), ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
            DataRow dr = dt.NewRow();
            for (int i = 0; i < headers.Length; i++)
            { //i did try i<rows.Length and also i < headers.Length && i <rows.Length
              dr[i] = rows[i]; //this is the line that is causing the error
            }
            dt.Rows.Add(dr);
        }
        return dt;
     }
}

CodePudding user response:

It looks like the code in the question is meant to read a CSV file with a header row and a fixed number of columns. However, that is not what your data is.

The best you can do is to store the data in a List<List<string>>, something like this:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

namespace Groceries
{
    internal class Program
    {
        static List<List<string>> LoadGroceryList(string filename)
        {
            var groceryList = new List<List<string>>();

            using (var reader = new StreamReader(filename))
            {
                while (!reader.EndOfStream)
                {
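                    var line = reader.ReadLine();
                    // Skip blank lines; Regex.Split would otherwise return a single empty item.
                    if (string.IsNullOrWhiteSpace(line)) { continue; }
                    // Split on commas that are not inside double quotes, trimming stray spaces.
                    var items = Regex.Split(line, ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)").Select(s => s.Trim());
                    groceryList.Add(items.ToList());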
                }
            }

            return groceryList;

        }

        static void Main(string[] args)
        {
            string filepath = @"C:\temp\Groceries.csv";
            var groceries = LoadGroceryList(filepath);

            var uniqueItems = groceries.SelectMany(x => x).Distinct().OrderBy(y => y);

            Console.WriteLine(string.Join("\r\n", uniqueItems));

            Console.ReadLine();

        }
    }
}

For the data in the question, that outputs:

baking powder
bottled beer
chicken
citrus fruit
condensed milk
cream cheese
dishes
finished products
flour
frankfurter
frozen vegetables
fruit/vegetable juice
grapes
ice cream
Instant food products
napkins
newspapers
oil
onions
other vegetables
packaged fruit/vegetables
pork
rolls/buns
root vegetables
seasonal products
soda
specialty bar
specialty cheese
tropical fruit
vinegar
whipped/sour cream
whole milk
yogurt
zwieback
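
The question also asks for a truth table with 1 where an item appears in a row and 0 where it does not. A minimal sketch of that step follows, assuming the rows have already been loaded into a List<List<string>>. The names TruthTable and Build are made up for this illustration, and the columns are sorted with an ordinal comparison, which may order mixed-case items differently from the culture-sensitive default.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Groceries
{
    // Hypothetical helper for illustration; not part of the code in the question.
    public static class TruthTable
    {
        // Returns the sorted distinct items (the columns) and one 0/1 array per basket.
        public static (List<string> Columns, List<int[]> Rows) Build(List<List<string>> baskets)
        {
            var columns = baskets
                .SelectMany(b => b)
                .Select(item => item.Trim())
                .Where(item => item.Length > 0)
                .Distinct()
                .OrderBy(item => item, StringComparer.Ordinal)
                .ToList();

            var rows = new List<int[]>();
            foreach (var basket in baskets)
            {
                // A set gives fast membership checks for each column.
                var present = new HashSet<string>(basket.Select(item => item.Trim()));
                rows.Add(columns.Select(c => present.Contains(c) ? 1 : 0).ToArray());
            }

            return (columns, rows);
        }

        public static void Main()
        {
            // Three rows taken from the sample data in the question.
            var baskets = new List<List<string>>
            {
                new List<string> { "vinegar", "oil" },
                new List<string> { "rolls/buns", "soda", "specialty bar" },
                new List<string> { "whole milk" },
            };

            var table = Build(baskets);

            Console.WriteLine(string.Join(",", table.Columns));
            foreach (var row in table.Rows)
            {
                Console.WriteLine(string.Join(",", row));
            }
            // Prints:
            // oil,rolls/buns,soda,specialty bar,vinegar,whole milk
            // 1,0,0,0,1,0
            // 0,1,1,1,0,0
            // 0,0,0,0,0,1
        }
    }
}
```

Because the columns come from the distinct items rather than from a header line, rows of any length can be handled without indexing past the end of an array.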

CodePudding user response:

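Your code assumes that the number of items in every row is exactly equal to headers.Length, but in your CSV file each row has a different length from the others.

The file also has no header line, so the first row of data determines how many columns the DataTable gets. A longer row then has no column to write into, and a shorter row has no rows[i] to read from.

You must check the number of items in each row against the number of columns before indexing.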
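
One way to apply that check while keeping a DataTable is to loop only over the items a row actually has and to add columns when a row is wider than the table. This is a rough sketch under some assumptions: the file has no header line, fields contain no quoted commas (so a plain Split is enough), and the names RaggedCsv, ToDataTable and the generated "Item1", "Item2", ... column names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper for illustration; not part of the code in the question.
public static class RaggedCsv
{
    // Builds a DataTable from rows of differing length, growing the column set as needed.
    public static DataTable ToDataTable(IEnumerable<string[]> rows)
    {
        var dt = new DataTable();
        foreach (var fields in rows)
        {
            // Add generic columns until the table is wide enough for this row.
            while (dt.Columns.Count < fields.Length)
            {
                dt.Columns.Add("Item" + (dt.Columns.Count + 1));
            }

            DataRow dr = dt.NewRow();
            // Loop over the fields this row really has, never over a fixed header length.
            for (int i = 0; i < fields.Length; i++)
            {
                dr[i] = fields[i];
            }
            dt.Rows.Add(dr);
        }
        return dt;
    }

    public static void Main()
    {
        var lines = new[] { "vinegar,oil", "whole milk", "rolls/buns,soda,specialty bar" };
        var rows = new List<string[]>();
        foreach (var line in lines)
        {
            rows.Add(line.Split(','));
        }

        DataTable table = ToDataTable(rows);
        Console.WriteLine(table.Columns.Count + " columns, " + table.Rows.Count + " rows");
    }
}
```

Cells that a shorter row never fills are left as DBNull, so any code reading the table afterwards needs to allow for that.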
