How to read text in a file into a string variable in c

Time:01-24

I have a CSV file containing numbers separated by commas, like 1,2,3,4,5,6,7,8,9..... I intend to parse all these numbers into a vector I have initialized in C. I have managed to read all the characters in the file through to the end with the code below. My problem now is assigning the characters fetched with getc() to a string for further comma-based processing, where I will split the string into a string array using the comma as the separator. I have tried appending the characters to the string on each iteration with no success. Please help.

#include <string.h>
#include <stdio.h>

int main() {
    // define the vector file
    FILE* vec = fopen("vectors.csv", "r");
    // define the dimensions of the vector
    int rows = 100;
    // this vector has 100 rows for instance
    int vec[rows];
    //I managed to read the csv file now I need to read it into a string
    char ch,line;
    while((ch=getc(vec))!=EOF){
    //append the characters to the empty string
    line +=ch;
    }
 //confirm the accuracy of the string by outputting it
   printf("%s",line);
 //the above line outputs blank, please help
    



    return 0;
}

I always thought a vector was the same as a one dimension array but my research could have led me astray, please help.

CodePudding user response:

Converting a CSV record (a string) to an array is actually pretty easy. There are a couple of ways to do it, but using strtok() with a string→int function is perhaps the easiest.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
  char s[1000];
  FILE * f = fopen("vectors.csv", "r");
  while (fgets(s, sizeof(s), f))
  {
    int vector[100];
    int size = 0;

    const char * ds = " ,\t\n";  // skip spaces, commas, tabs, and that newline at the end of s[]
    for (char * token = strtok(s, ds);  token;  token = strtok(NULL, ds))
    {
      vector[size++] = atoi(token);
    }

    do_something(vector, size);
  }
  fclose(f);
  return 0;
}

You may wish to add some error handling:

  • fopen() may return NULL
  • fgets() may not read the whole line (because there isn’t enough space in s[])
  • there may be more numbers per line than space available in vector[]
  • atoi() may fail, but it won’t tell you if it does. strtol() is much better at error checking.
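The checks listed above could be folded into the tokenizing loop roughly like this. It is only a sketch: parse_int_line() is a name made up for illustration, and it assumes the line has already been read into a writable buffer with fgets().

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the original answer).
   Splits `line` in place on spaces, commas, tabs and newlines and stores
   at most `cap` integers in `out`.
   Returns the number of values stored, or -1 if a token is malformed or
   out of range, or if the line holds more than `cap` values. */
int parse_int_line(char *line, int *out, int cap)
{
  const char *ds = " ,\t\n";
  int size = 0;

  for (char *token = strtok(line, ds); token; token = strtok(NULL, ds))
  {
    if (size == cap)
      return -1;                        /* more numbers than space in out[] */

    char *end;
    errno = 0;
    long value = strtol(token, &end, 10);

    if (end == token || *end != '\0')   /* no digits, or trailing junk */
      return -1;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
      return -1;                        /* does not fit in an int */

    out[size++] = (int)value;
  }
  return size;
}
```

The caller would still need to check that fopen() did not return NULL, and that the buffer filled by fgets() ends in '\n' (or that feof() is set) before trusting that the whole line was read.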

Notice also that your string→int function can be string→whatever: your vector could be composed of floating-point values, for example. You just need a function that will perform the conversion.
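For instance, a floating-point variant might swap the conversion for strtod(). Again a sketch only: parse_double_line() is a hypothetical name, and the delimiter set is assumed to be the same as in the integer version.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: same tokenizing loop as before, but each token is
   converted with strtod() instead of atoi().
   Returns the number of values stored, or -1 on a malformed token.
   Values beyond `cap` are silently ignored in this sketch. */
int parse_double_line(char *line, double *out, int cap)
{
  const char *ds = " ,\t\n";
  int size = 0;

  for (char *token = strtok(line, ds); token && size < cap; token = strtok(NULL, ds))
  {
    char *end;
    double value = strtod(token, &end);

    if (end == token || *end != '\0')   /* not a number, or trailing junk */
      return -1;

    out[size++] = value;
  }
  return size;
}
```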

I have written it as a loop, assuming “vectorS” means that there are multiple records in your CSV, one vector per line, and that you wish to process all of them in turn. If your file is really just a single vector (on a single line) then just get rid of the loop.

If your file is massive to the point that using three passes over the string (fgets() plus strtok() plus atoi() / strtol() / scanf() / whatever) becomes an issue, you can do it in a single pass with just scanf(). Post a comment and I’ll update the answer.

Update

As your file contains nothing but numbers, separated by commas, as a single vector, you can also loop using just scanf().

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
#include <stdlib.h>

int main()
{
  int vector[100];
  int size = 0;

  FILE * f = fopen("vectors.csv", "r");
  while (fscanf_s(f, "%d ,", vector + size) == 1)
  {
    size += 1;
  }
  }
  fclose(f);

  do_something(vector, size);
}

While this looks very convenient, it cannot directly handle newlines in the input. If you wish to scan multiple vector records, one per line, you would still have to read the line in as a string, then parse the string. You can still do it with sscanf(), but it gets a tiny bit less easy to grok at first sight.
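One way the sscanf() variant might look is sketched here. The helper name scan_int_line() is invented for illustration; the idea is that %n reports how many characters each call consumed, so the next call can resume from there.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: parses comma-separated integers from a line that
   was already read with fgets(). Stops at the first thing that is not
   "number, optional spaces, comma". Returns the number of values stored. */
int scan_int_line(const char *line, int *out, int cap)
{
  int size = 0;
  int used = 0;

  while (size < cap && sscanf(line, "%d%n", &out[size], &used) == 1)
  {
    size += 1;
    line += used;                        /* step past the number just read */

    while (isspace((unsigned char)*line))
      line += 1;                         /* skip spaces and the final newline */

    if (*line != ',')
      break;                             /* end of record, or unexpected text */
    line += 1;                           /* step past the comma */
  }
  return size;
}
```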

True single-pass parsing?

The update is single-pass, but if you need to keep that (because you are trying to handle literally tens of thousands of values) and do it line-by-line, then you need to be a little sneakier when dealing with “delimiters”. Again, comment if it is an issue.
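The answer does not spell out that sneakier approach, so the following is only one possible sketch, not necessarily what the author had in mind. The hypothetical read_record() reads a number with fscanf() and then inspects the next character itself to decide whether the record continues (a comma) or ends (a newline or end of file).

```c
#include <stdio.h>

/* Hypothetical helper: reads one comma-separated record (one line) from f
   in a single pass. Returns the number of values stored; 0 means end of
   file or unparsable input.
   Caveats of this sketch: "%d" skips leading whitespace, so blank lines
   are skipped and a trailing comma joins a line to the next one. A record
   longer than `cap` is continued by the next call. */
int read_record(FILE *f, int *out, int cap)
{
  int size = 0;

  while (size < cap && fscanf(f, "%d", &out[size]) == 1)
  {
    size += 1;

    int c;
    do
    {
      c = getc(f);                       /* look at the delimiter ourselves */
    } while (c == ' ' || c == '\t' || c == '\r');

    if (c != ',')
    {
      if (c != '\n' && c != EOF)
        ungetc(c, f);                    /* leave unexpected text unread */
      break;                             /* newline or EOF ends the record */
    }
  }
  return size;
}
```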

Vectors

In programming languages you can safely assume a vector is some kind of list analogous to an array. The exact meaning of the term depends on the language or the environment.

For example, vectors can be represented as tuples, arrays, lists, etc. In C++ a vector is a library object that represents a dynamic array. And so on.

In mathematical terms, a vector is an element of a linear (vector) space. But it is, again, often represented as a tuple of scalar values.

CodePudding user response:

Check the result of fopen to see if it succeeded. if(vec == NULL) { /* error handling here */ }.

If the file holds a limited number of values, it will probably be easiest to read it all as one single string with fgets. Alternatively, write a loop and call either fgets or fscanf repeatedly. If you read the input with fgets, you need to convert the numbers to integers with strtol (fscanf does this internally, so it is easier to use).

You need to check the result of the functions to see if they succeeded, to prevent formatting errors as well as reading beyond the end of the file.

Remember to fclose the file handle when done.
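Put together, the fscanf route described in this answer might look roughly like the sketch below. load_vector() and its return convention are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: opens `path`, reads up to `cap` comma-separated
   integers into `out`, and closes the file again.
   Returns the number of values read, or -1 if the file cannot be opened. */
int load_vector(const char *path, int *out, int cap)
{
  FILE *f = fopen(path, "r");
  if (f == NULL)
    return -1;                           /* fopen failed */

  int size = 0;
  /* Stop on a full array, a formatting error, or the end of the file. */
  while (size < cap && fscanf(f, "%d", &out[size]) == 1)
  {
    size += 1;
    fscanf(f, " ,");                     /* skip optional whitespace and one comma */
  }

  fclose(f);
  return size;
}
```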

CodePudding user response:

You will fail because your input (what you call a string) is not a string. In C there is no built-in string type, only an array of chars ending with a null character.

So this (the line variable in there): char ch,line;

cannot be used like:

line +=ch;

It will just increase the char value of the "line" variable, which typically wraps around whenever it exceeds its limit (CHAR_MAX).
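What the question seems to be after is appending each character to a char array and terminating it with a null. A minimal sketch, assuming a fixed-size buffer supplied by the caller (read_into_string() is a made-up name):

```c
#include <stdio.h>

/* Hypothetical helper: copies characters from f into buf until EOF or
   until the buffer is full, then adds the terminating null.
   Returns the number of characters stored (not counting the null). */
size_t read_into_string(FILE *f, char *buf, size_t cap)
{
  size_t len = 0;
  int ch;                                /* int, not char, so EOF stays distinguishable */

  if (cap == 0)
    return 0;                            /* no room even for the null */

  while (len + 1 < cap && (ch = getc(f)) != EOF)
    buf[len++] = (char)ch;               /* append at the current end */

  buf[len] = '\0';                       /* this is what makes it a C string */
  return len;
}
```

With a buffer declared as char line[1000];, the result can then be printed with printf("%s", line).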

You had better use fgets() in the first place, then malloc()/realloc()/free() memory for your char pointer according to the number of characters in each line (strlen()). So your variable should be: char *line;
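A sketch of that fgets() plus realloc() idea follows. The helper name read_line(), the starting capacity of 64, and the doubling strategy are all assumptions chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one whole line of any length from f.
   Returns a malloc()ed string (including the '\n' if one was read) that
   the caller must free(), or NULL at end of file or on allocation failure. */
char *read_line(FILE *f)
{
  size_t cap = 64;                       /* arbitrary starting capacity */
  size_t len = 0;
  char *line = malloc(cap);
  if (line == NULL)
    return NULL;

  while (fgets(line + len, (int)(cap - len), f))
  {
    len += strlen(line + len);
    if (len > 0 && line[len - 1] == '\n')
      break;                             /* got the complete line */

    cap *= 2;                            /* buffer was too small: grow it */
    char *bigger = realloc(line, cap);
    if (bigger == NULL)
    {
      free(line);
      return NULL;
    }
    line = bigger;
  }

  if (len == 0)                          /* nothing read: end of file */
  {
    free(line);
    return NULL;
  }
  return line;
}
```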

Another solution could be to count the lines in the file, find the maximum line length among them, and then declare your variable as:

char line[<max_line_length> <line_count> 1];
