I have function which gets arguments: char* filedata (stores whole file) and FILE *fp (the opened file).
void read_file(char *filedata, FILE *fp){
char buffer[1000];
while(fgets(buffer, sizeof(buffer), fp))
{
char *new_str;
if((new_str = malloc(strlen(filedata) strlen(buffer) 1)) != NULL)
{
new_str[0] = '\0'; // ensures the memory is an empty string
strcat(new_str, filedata);
strcat(new_str, buffer);
}
else
{
printf("malloc failed!\n");
}
strcpy(filedata, new_str);
}
fclose(fp);
}
But this isn't too fast... is there faster way to read the whole file?
CodePudding user response:
With some caveats, you can read the entire file into an appropriately-sized buffer in one fell swoop using the fread() function.
The following code outlines how to open the file, determine its size, allocate a buffer of that size, then read the file's data (all of it) into that buffer. But note the caveats about the fseek and ftell functions (discussed afterwards):
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char* filename = "MyFile.txt"; // Or whatever
FILE* fp = fopen(filename, "rb"); // Open in binary mode
int seek = fseek(fp, 0, SEEK_END); // CAVEAT: Files in BINARY mode may not support SEEK_END ...
if (seek != 0) {
printf("Cannot fseek on binary file!\n");
fclose(fp);
return 1;
}
size_t filesize = (size_t)ftell(fp); // ... but this is not reliable if opened in TEXT mode!
char* filedata = calloc(filesize 1, 1); // Add 1 for the terminating "nul" character
rewind(fp);
fread(filedata, 1, filesize, fp); // Read whole file
// Clean up ...
fclose(fp);
free(filedata);
return 0;
}
Caveats:
Note that files opened in BINARY mode (as in the "rb" mode argument I gave in the fopen() call) are not required to support the SEEK_END origin in calls to fseek(); if this is the case on your platform, then this answer offers some alternatives to determine the file's size. From cppreference:
… Binary streams are not required to support SEEK_END, in particular if additional null bytes are output.
However, on the other hand, opening the file in TEXT mode (using "rt") will make the call to ftell effectively meaningless, in terms of the required size for your input buffer and the value specified to fread; from cppreference:
If the stream is open in text mode, the value returned by this function is unspecified and is only meaningful as the input to fseek().
CodePudding user response:
Annotating your function (not mentioning the leaks, etc) and counting the operations on the character buffers:
void read_file(char *filedata, FILE *fp){
char buffer[1000];
while(fgets(buffer, sizeof(buffer), fp)) // <<-- NEW_SIZE
{
char *new_str;
if((new_str = malloc(strlen(filedata) // <<-- OLD_SIZE
strlen(buffer) // <<-- NEW_SIZE
1)) != NULL)
{
new_str[0] = '\0'; // ensures the memory is an empty string
strcat(new_str, filedata); // <<-- OLD_SIZE
strcat(new_str, buffer); // <<-- NEW_SIZE
}
else
{
printf("malloc failed!\n");
}
strcpy(filedata, new_str); // <<-- OLD_SIZE NEW_SIZE
}
fclose(fp);
}
fgets(). strlen(), strcat() and strcpy() all need to loop over a character buffer.
Only the fgets() is actually needed, the rest of the copying can be avoided.
Adding the number of passes over the buffers:
sum of operations per loop: 3 * OLD_SIZE 4 * NEW_SIZE and: keep in mind that OLD_SIZE is actually SUM(NEW_SIZE), recursively, so your function has QUADRATIC behavior wrt the number of times the loop iterates.(basically the numner of lines read)
So you end up with:
Number of times a character is inspected
= 4 * N_LINE * LINE_SIZE
7 * (NLINE * (NLINE-1) ) * LINE_SIZE
;
[this is the "Schlemiel, the painter" story]
CodePudding user response:
Below's my function illustrating how I usually do it. Not sure how fast it is compared to all other possible C implementations. But I'd imagine they're all pretty similar unless poorly programmed in one way or another, which might lead to slower, less efficient execution.
/* ==========================================================================
* Function: readfile ( FILE *ptr, int *nbytes )
* Purpose: read open file ptr into internal buffer
* --------------------------------------------------------------------------
* Arguments: ptr (I) FILE * to already open (via fopen)
* file, whose contents are to be read
* nbytes (O) int * returning #bytes in returned buffer
* --------------------------------------------------------------------------
* Returns: ( unsigned char * ) buffer with ptr's contents
* --------------------------------------------------------------------------
* Notes: o caller should free() returned output buffer ptr when finished
* ======================================================================= */
/* --- entry point --- */
unsigned char *readfile ( FILE *ptr, int *nbytes ) {
/* ---
* allocations and declarations
* ------------------------------- */
unsigned char *outbuff = NULL; /* malloc'ed and realloc'ed below */
int allocsz=0, reallocsz=500000, /*total #bytes allocated, #realloc */
blksz=9900, nread=0, /* #bytes to read, #actually read */
buffsz = 0; /* total #bytes in buffer */
/* ---
* collect all bytes from ptr
* ----------------------------- */
if ( ptr != NULL ) { /* return NULL error if no input */
while ( 1 ) { /* read all input from file */
if ( buffsz blksz 99 >= allocsz ) { /* first realloc more memory */
allocsz = reallocsz; /*add reallocsz to current allocation*/
if ( (outbuff=realloc(outbuff,allocsz)) == NULL ) /* reallocate */
goto end_of_job; } /* quit with NULL ptr if failed */
nread = fread(outbuff buffsz,1,blksz,ptr); /* read next block */
if ( nread < 1 ) break; /* all done, nothing left to read */
buffsz = nread; /* add #bytes from current block */
} /* --- end-of-while(1) --- */
fclose(ptr); /* close fopen()'ed file ptr */
} /* --- end-of-if(ptr!=NULL) --- */
end_of_job:
if ( nbytes != NULL ) *nbytes = buffsz; /* #bytes in outbuff */
return ( outbuff ); /* back to caller with output or NULL*/
} /* --- end-of-function readfile() --- */
