01-12-2015 09:39 AM
Your way is fast, but haven't we lost support for values with a CR inside them, or am I wrong?
Thanks for the help.
01-12-2015 09:43 AM
No, I'm wrong. It's fully working.
The only thing I have to correct is the string clearing. I would replace
    //clear strings
    for(i = 0; i < numero_stringhe; i++)
        stringa_destinazione[i][0]=0;
with my version
    for(i = 0; i < numero_stringhe; i++){
        lunghezza_stringa = strlen(stringa_destinazione[i]);
        memset(stringa_destinazione[i], 0, lunghezza_stringa);
    }
because with yours only the first character is set to 0.
01-12-2015 09:50 AM
I have not tried inserting a CR, but it should be handled here (I hope it works):
    case '\r':
    case '\n':
        if(inquote){
            //carriage return or line feed inside quote
            //insert line break and parse next line
            stringa_destinazione[stringa_in_corso][index_stringa_in_corso++] = '\r';
            stringa_destinazione[stringa_in_corso][index_stringa_in_corso++] = '\n';
            carattere = lines[++riga_partenza];
        }
        else //end of line
            stringa_destinazione[stringa_in_corso][index_stringa_in_corso] = 0; //terminate string
        break;
I think there is still room to improve the parser: the first step, which splits the buffer into lines, is not required. The state machine can handle every CR, so the array of lines could be replaced by a single buffer containing the whole file content.
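The single-buffer idea could look roughly like the sketch below. It is not the parser from this thread: the function name parse_record, the fixed FIELD_LEN, and the n_fields output parameter are all assumptions made for illustration. It walks one buffer, keeps CR/LF that appear inside quotes, and parses but does not store fields beyond max_fields.

```c
#include <stddef.h>

#define FIELD_LEN 64  /* hypothetical fixed field size, for illustration only */

/* Parses one CSV record starting at p, where p points into a buffer that
 * holds the whole file. Stores at most max_fields fields (extra fields are
 * parsed but dropped), writes the number of fields seen to *n_fields, and
 * returns a pointer to the start of the next record. */
const char *parse_record(const char *p, char fields[][FIELD_LEN],
                         int max_fields, char delim, int *n_fields)
{
    int field = 0;      /* index of the field being parsed */
    size_t len = 0;     /* characters stored in the current field */
    int inquote = 0;    /* 1 while inside a quoted field */
    int at_start = 1;   /* 1 while no character of the field was consumed */
    int done = 0;

    while (!done) {
        char c = *p;
        int store = 0, end_field = 0;

        if (inquote) {
            if (c == '\0')                    { end_field = done = 1; } /* unterminated quote */
            else if (c == '"' && p[1] == '"') { store = 1; p++; }       /* "" -> one literal quote */
            else if (c == '"')                { inquote = 0; }          /* closing quote */
            else                              { store = 1; }            /* includes CR and LF */
        }
        else if (c == '"' && at_start)        { inquote = 1; }          /* opening quote */
        else if (c == delim)                  { end_field = 1; }
        else if (c == '\r' || c == '\n' || c == '\0') { end_field = done = 1; }
        else                                  { store = 1; }

        /* store only if the field exists and there is room for the terminator */
        if (store && field < max_fields && len < FIELD_LEN - 1)
            fields[field][len++] = c;

        if (end_field) {
            if (field < max_fields)
                fields[field][len] = '\0';
            field++;
            len = 0;
        }
        at_start = end_field;

        if (c != '\0')
            p++;                              /* consume the character */
        if (done && c == '\r' && *p == '\n')
            p++;                              /* treat CR LF as one line break */
    }

    *n_fields = field;
    return p;
}
```

A caller would load the file into one zero-terminated buffer and call parse_record repeatedly, passing the returned pointer back in, until it points at the terminating 0.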
Regards,
Stef
01-12-2015 09:53 AM
You don't have to clear the whole string content. In C, a string is terminated by a 0, so if you write a 0 at the first position, the string is considered empty.
01-12-2015 09:57 AM
OK, but if I have something like "351.12" in memory and I set only the first char to 0, I will have a 0 followed by "51.12" in memory.
Afterwards, if I change the value of that string to, for example, "1.2", I will have "1.2.12" and I will get an error in atof().
01-12-2015 10:04 AM
If your string contains '351.12', in memory you get 351.12\0. When you write '1.2' into the string, in memory you get 1.2\012\0, and your string ends at the first 0 encountered. The only thing you have to take care of is to terminate your string properly with a 0 character.
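The memory layout described above can be checked with a small sketch. Both helper names (overwrite_and_parse and dump_bytes) are hypothetical and exist only for this illustration. The first overwrites a buffer and converts it with atof(); the second prints every byte of the buffer, including the bytes left over after the terminator.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: overwrites buf (which must be large enough) with
 * text, then returns the value that atof() reads back from it. */
double overwrite_and_parse(char *buf, const char *text)
{
    strcpy(buf, text);   /* copies the characters and the terminating 0 */
    return atof(buf);    /* conversion never reads past the first 0 */
}

/* Hypothetical helper: prints every byte of the buffer, showing 0 bytes
 * as \0, so the leftovers after the terminator become visible. */
void dump_bytes(const char *buf, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (buf[i] == '\0')
            printf("\\0");
        else
            putchar(buf[i]);
    }
    putchar('\n');
}
```

Starting from char buf[8] = "351.12", setting buf[0] = 0 makes strlen(buf) return 0. After overwrite_and_parse(buf, "1.2"), the string compares equal to "1.2" and the stale "12" behind the terminator is never read by the string functions; dump_bytes(buf, 7) makes those leftover bytes visible.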
01-12-2015 10:13 AM
Sure, I don't know why I was thinking that!
Too much panettone during the holidays!
01-12-2015 10:29 AM
One error found in my code:
when I look for the "#VALUE:" string and find it, I have to increment the variable by 1, because I use it not only to get the first numeric value but also to calloc the buffer that contains it.
    do{
        risultato_find_pattern = FindPattern(lines[numero_riga_value], 0, -1, "#VALUE:", 0, 1);
    }while( (risultato_find_pattern < 0) && (numero_riga_value++ < 100) );
    numero_riga_value++;
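The index-versus-count distinction can be sketched in portable C. This is only an illustration: find_marker_line is a hypothetical helper, and it swaps FindPattern for the standard strstr(), which is case-sensitive and so may not match the behaviour of the original call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the zero-based index of the first line that
 * contains marker, or -1 if none of the first n_lines lines contains it. */
int find_marker_line(char **lines, int n_lines, const char *marker)
{
    for (int i = 0; i < n_lines; i++) {
        if (lines[i] != NULL && strstr(lines[i], marker) != NULL)
            return i;
    }
    return -1;
}
```

If the marker is found at index idx, the number of lines up to and including it is idx + 1, which is the value to use when sizing a buffer for them.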
01-13-2015 01:18 AM
Hi Stef, I've seen that one check is missing in the CSV reader.
In my version I check whether I've reached the maximum number of readable columns; beyond that I still parse the columns but don't store them.
In your version this check is missing, so if I have more columns than expected I get an overflow error.
Right now I have other work to do; when I have some time I will fix it.
01-13-2015 01:22 AM
Sorry for the flood, it was easy and I have already done it:
    int leggi_riga_csv_v2(char **lines, int riga_partenza, char *stringa_destinazione[], int numero_stringhe, int formato)
    {
        char delimitatore[2] = {',',';'};
        int stringa_in_corso = 0;
        int index_stringa_in_corso = 0;
        int inquote = 0;
        int i = 0;
        int error = 0;
        char *carattere = NULL;

        for(i = 0; i < numero_stringhe; i++){
            stringa_destinazione[i][0]=0;
        }

        //Point to beginning of current line
        carattere = lines[riga_partenza];
        index_stringa_in_corso = 0;

        while(*carattere && !error)
        {
            switch(*carattere){
                case '\"':
                    if(index_stringa_in_corso == 0){
                        // if the first character is a ", then it is a special string
                        inquote = 1;
                        carattere++; //skip quote
                        carattere++; //get next character
                    }
                    else{
                        if(inquote){
                            //Check for double quote
                            carattere++;
                            if(*carattere == '\"'){
                                if(stringa_in_corso < numero_stringhe){
                                    stringa_destinazione[stringa_in_corso][index_stringa_in_corso++] = *carattere; //concatenate quote
                                }
                            }
                            else if((*carattere != 0) && (*carattere != delimitatore[formato])){
                                error = 1; //Quoted string not followed by delimiter or end of string!
                            }
                            else{
                                //end of quoted string
                                if(stringa_in_corso < numero_stringhe){
                                    stringa_destinazione[stringa_in_corso][index_stringa_in_corso] = 0; //terminate string
                                }
                                //parse next string
                                stringa_in_corso++;
                                index_stringa_in_corso=0;
                            }
                        }
                        else
                            error = 1; //Quote inside unquoted string!
                    }
                    break;

                case ',':
                    if(formato == 1){
                        if(stringa_in_corso < numero_stringhe){
                            stringa_destinazione[stringa_in_corso][index_stringa_in_corso++] = '.'; //replace , by .
                        }
                    }
                    else {
                        if(stringa_in_corso < numero_stringhe){
                            stringa_destinazione[stringa_in_corso][index_stringa_in_corso] = 0; //terminate string
                        }
                        //parse next string
                        stringa_in_corso++;
                        index_stringa_in_corso=0;
                    }
                    carattere++;
                    break;

                case ';':
                    if(formato == 0){
                        if(stringa_in_corso < numero_stringhe){
                            stringa_destinazione[stringa_in_corso][index_stringa_in_corso++] = ';';
                        }
                    }
                    else {
                        if(stringa_in_corso < numero_stringhe){
                            stringa_destinazione[stringa_in_corso][index_stringa_in_corso] = 0; //terminate string
                        }
                        //parse next string
                        stringa_in_corso++;
                        index_stringa_in_corso=0;
                    }
                    carattere++;
                    break;

                case '\r': // CR = 0x0D = 13
                case '\n': // LF = 0x0A = 10
                    if(inquote){
                        //carriage return or line feed inside quote
                        //insert line break and parse next line
                        if(stringa_in_corso < numero_stringhe){
                            stringa_destinazione[stringa_in_corso][index_stringa_in_corso++] = '\r';
                            stringa_destinazione[stringa_in_corso][index_stringa_in_corso++] = '\n';
                        }
                        carattere = lines[++riga_partenza];
                    }
                    else{
                        //end of line
                        if(stringa_in_corso < numero_stringhe){
                            stringa_destinazione[stringa_in_corso][index_stringa_in_corso] = 0; //terminate string
                        }
                    }
                    break;

                default:
                    //copy other characters to destination string
                    if(stringa_in_corso < numero_stringhe){
                        stringa_destinazione[stringa_in_corso][index_stringa_in_corso++] = *carattere;
                    }
                    carattere++;
                    break;
            }
        }
        return riga_partenza;
    }