read character from file

shivels · ‎10-28-2007

Hi,

I coded a program that reads a given integer. The integer represents a character location in another file. The program then opens a text file. It is supposed to go to the integer location and read 500 characters past that location. I am unable to get the program to go to the correct location in the file. If I give an integer value like 135159, the program goes to the location it claims is correct, but it is not. Are there any constant values that LV uses when reading file locations? In my file there are 71 characters per line: 70 real characters and 1 line feed.

Adam

Matthew_Kelton · ‎10-28-2007

You didn't include your Read Characters from File1.vi.

shivels · ‎10-28-2007

Hi,

Here is the read from characters1.vi

Adam

Matthew_Kelton · ‎10-28-2007

Well, it's difficult to tell what you are expecting to get, since you didn't tell us, but I wonder if your starting index takes into account the two header lines in the chrom1.p1 file? The file starts with these two lines:

;Human Chromosome 1
CONTIG-1p1

And then has the list of character codes. If your offset in your other string is based on the start of data, not the start of the file, then, for this file, you will be 31 bytes off. If this header is constant, then you can add 31 to your number. If this header can change, then you'll need to determine its length.
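If the header can change, its length can be measured rather than hard-coded. LabVIEW is graphical, so the following is only a text-language sketch of the idea in Python; the function name `header_length` and the assumption that the header is always a fixed number of lines ending in a single line feed are illustrative, not taken from the attached VI.

```python
import io

def header_length(stream, header_lines=2):
    """Return the number of bytes taken up by the first `header_lines` lines.

    Assumes `stream` is opened in binary mode, so each line is counted
    together with its line-feed byte.
    """
    total = 0
    for _ in range(header_lines):
        total += len(stream.readline())
    return total

# The two header lines quoted above, followed by a made-up data line.
sample = io.BytesIO(b";Human Chromosome 1\nCONTIG-1p1\nACGT\n")
print(header_length(sample))  # 31
```

With line-feed-only line endings, the two header lines shown above come to 20 + 11 = 31 bytes, which matches the 31-byte correction.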

If these files aren't that big, you should load the file, then process it instead of doing constant file I/O to get one byte at a time.

shivels · ‎10-28-2007

Hi,

The header stays constant. The program is supposed to open a text file. That text file contains the location of each sequence match in chrom1.p1, given as a character position, and the length of the match. The program then calculates (sequence match position - 500) and ((sequence match position + length) + 500). The program is then supposed to open chrom1.p1, go to the character position, and count nucleotides until it reaches 500. It does this twice, once for the 500 before and once for the 500 after. When I run the program for the first match, which begins at character position 357750 and ends at character position 358005, I had to add a constant of (5141 + (357750 - 500) + iteration) to get it to go to the correct location. For the rest of the matches, using this constant makes them all off.

Adam

RavensFan · ‎10-28-2007

I think you have a lot of unnecessary complication in your code. First, I would recommend cleaning up all the wiring. Right-click a wire and choose Clean Up Wire to get rid of unnecessary bends. Line up your functions so that data flows from left to right without an excessive number of bends. There is a lot of iterative string searching and cleaning going on in the upper left. This could probably be looped, or a Scan From String function could return many of these values in a single step. But is it possible that your counts are off because of characters that you are or aren't stripping out in the preliminary code? Also, local variables would be better to use than the Value property nodes. Shift registers would be better still.

There seems to be no reason to read or write to the files one character at a time. The code would be a lot more efficient if you read all 500 characters at once. Then do your search routines. You know you are reading exactly 500 characters because you are reading 1 at a time and executing the loop exactly 500 times. Each execution is doing an open, read and close of the file.
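As a rough text-language illustration of the single-read approach (Python, since a block diagram cannot be pasted as text), the sketch below opens the file once, seeks to the start position, and reads the whole block in one call. The name `read_block` and the throwaway demo file are hypothetical. Note that a raw 500-byte read from the real file would include any line feeds inside that span.

```python
import os
import tempfile

def read_block(path, offset, count=500):
    """Return up to `count` bytes starting at byte `offset`: one open, one seek, one read."""
    with open(path, "rb") as f:  # binary mode so offsets are plain byte positions
        f.seek(offset)
        return f.read(count)

# Small self-check on a throwaway file (600 bytes of filler, not real sequence data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"ACGT" * 150)
    demo_path = tmp.name

block = read_block(demo_path, 40)
os.remove(demo_path)
print(len(block))  # 500
```

Compared with a loop that opens, reads one character, and closes 500 times, this touches the file once per block.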

Since your problems seem to be in determining the start position, concentrate on the preliminary code where you calculate it. Put indicators on the various strings and numeric values to see whether those functions are performing the way you want. Perhaps the errors are in the math that calculates the start position. Your second flat sequence doesn't seem to take into account the length of the search string. It looks like it takes a part of the search string and converts it to a number. Maybe the error is in the data in that string control and not in the calculations.

Message Edited by Ravens Fan on 10-28-2007 09:34 PM

Matthew_Kelton · ‎10-28-2007

The problem is that your offset doesn't take into account the file header or the end-of-line characters, so when you read from the file, the position counts all of those characters as well. I wrote a little VI that calculates the "true offset." One point of confusion for me is that I had to enter the original spot (in your example, 357750) to get what you said is the right point for the -500 spot. I would have thought I would have to take the 500 into account, since going 500 back also covers line feeds (in this case 7), but if you say that the spot you have is the correct spot, then do the calculation on the match position without the 500 subtracted.
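The attached VI is not reproduced here, but the "true offset" arithmetic can be sketched in Python under the facts stated in this thread: a 31-byte header and 70 data characters followed by one line feed on every line. The function name `true_offset` and the choice of a 0-based data offset are assumptions made for illustration.

```python
HEADER_BYTES = 31      # two header lines, as counted earlier in the thread
BASES_PER_LINE = 70    # data characters per line; each line also ends in one line feed

def true_offset(data_offset, header_bytes=HEADER_BYTES, bases_per_line=BASES_PER_LINE):
    """Map a 0-based offset into the data alone to a byte offset in the file."""
    line_feeds = data_offset // bases_per_line  # line feeds that precede this character
    return header_bytes + data_offset + line_feeds

# Extra bytes for the match position 357750: 31 header bytes + 5110 line feeds.
print(true_offset(357750) - 357750)  # 5141
# Extra bytes for the position 500 characters earlier: 7 fewer line feeds.
print(true_offset(357250) - 357250)  # 5134
```

The correction grows by one for every 70 data characters, which is why a single constant such as 5141 fits the first match but is off for the others.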

You can avoid this by reading the file in, stripping the header, and removing all the line feeds. Your data should be one contiguous stream at that point. But, you will need to determine if loading the whole file to do this is practical. With this calculation, you can do it a byte at a time if you wish.
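A minimal Python sketch of this load-and-strip approach, assuming the whole file fits in memory and the header is always two lines. The names `load_sequence` and `flanks` are hypothetical, and positions are treated as 0-based offsets into the cleaned data.

```python
def load_sequence(raw, header_lines=2):
    """Drop the header lines and all line breaks, leaving one contiguous byte string."""
    lines = raw.splitlines()  # splits on line feeds (and CR/LF) without keeping them
    return b"".join(lines[header_lines:])

def flanks(seq, start, length, flank=500):
    """Return the `flank` characters before and after a match at seq[start:start + length]."""
    before = seq[max(0, start - flank):start]
    after = seq[start + length:start + length + flank]
    return before, after

# Tiny made-up example: two header lines, then two 4-character data lines.
raw = b";header\nCONTIG\nACGT\nTTGA\n"
seq = load_sequence(raw)
print(seq)                         # b'ACGTTTGA'
print(flanks(seq, 3, 2, flank=2))  # (b'CG', b'TG')
```

Once the data is contiguous, the positions from the match file can be used directly as slice indices, with no header or line-feed correction.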

Message Edited by Matthew Kelton on 10-28-2007 10:17 PM
