How many times a word appears in a text?

Net2022 · ‎03-14-2022

Hello, I have Googled and searched the forum, but I can't find anything that gets my code working. I want to search a .txt file and, on demand, count how many times a keyword appears. I've tried using a loop with the Match Pattern function, but I'm not getting the correct results. I'd appreciate any help.

crossrulz · ‎03-14-2022

You need to wire up the shift register to the "start index" input on the Match Pattern.
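Since LabVIEW diagrams don't paste as text, here is a rough Python sketch of the same idea, not the actual VI. The variable `start` stands in for the shift register, and `str.find` stands in for Match Pattern; the function name `count_occurrences` is made up for illustration.

```python
def count_occurrences(text: str, keyword: str) -> int:
    """Count non-overlapping occurrences of keyword in text."""
    if not keyword:
        return 0  # an empty keyword would match everywhere and never advance
    count = 0
    start = 0  # plays the role of the shift register feeding "start index"
    while True:
        hit = text.find(keyword, start)  # search only from 'start' onward
        if hit == -1:
            break  # no further match: stop the loop
        count += 1
        start = hit + len(keyword)  # resume just past the match
    return count
```

Without the update to `start`, every iteration would find the first occurrence again, which is the symptom of leaving the shift register unwired.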

Personally, I would just use the Search And Replace String function, setting "replace all" to TRUE and then reading the "Number Of Replacements" output.
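A loose Python analogue of that trick, assuming the keyword should be treated as literal text: `re.subn` replaces every match and also returns how many replacements it made. The helper name `count_by_replacement` is hypothetical.

```python
import re

def count_by_replacement(text: str, keyword: str) -> int:
    """Count occurrences of keyword by replacing all of them and reading the tally."""
    if not keyword:
        return 0
    # re.escape makes the keyword literal, so "." or "*" are not treated as regex syntax.
    # subn returns (new_string, number_of_substitutions); only the number is needed here.
    _, replacements = re.subn(re.escape(keyword), "", text)
    return replacements
```

The replaced string is simply discarded; only the count matters.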

There are only two ways to tell somebody thanks: Kudos and Marked Solutions
Unofficial Forum Rules and Guidelines
"Not that we are sufficient in ourselves to claim anything as coming from us, but our sufficiency is from God" - 2 Corinthians 3:5

RTSLVU · ‎03-14-2022

Try this:

Honestly, crossrulz's solution above is much better.

========================
=== Engineer Ambiguously ===
========================

paul_cardinale · ‎03-14-2022

One unpleasant complication is figuring out what delimits a word.

"If you weren't supposed to push it, it wouldn't be a button."

RTSLVU · ‎03-14-2022

@paul_cardinale wrote:

One unpleasant complication is figuring out what delimits a word.

Irrelevant, the program searches for an exact match and returns how many times that exact match was found.


altenbach · ‎03-14-2022

@paul_cardinale wrote:

One unpleasant complication is figuring out what delimits a word.

I agree that the problem is much more complicated, because there are plenty of delimiters (space, tab, period, linefeed, double-quote, numbers, etc.) so it is important to only use consecutive strings from the correct lexical class. Just searching for a substring will give you plenty of false positives (e.g. searching for "trust" would also count the substring in "trustworthy" as a "word", etc.). You also need to ignore upper/lowercase.
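To make the false-positive issue concrete, here is a small Python sketch (not LabVIEW). It assumes a "word" is a run of ASCII letters, so digits and punctuation act as delimiters, and it ignores case. The function name `count_whole_word` is invented for this example.

```python
import re

def count_whole_word(text: str, keyword: str) -> int:
    """Count case-insensitive matches of keyword that are not part of a longer word."""
    if not keyword:
        return 0
    # The lookarounds reject a match that has a letter directly before or after it.
    # They also succeed at the very start and end of the text.
    pattern = r"(?<![A-Za-z])" + re.escape(keyword) + r"(?![A-Za-z])"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))
```

With the text "Trust me: a trustworthy friend earns trust." and the keyword "trust", this counts 2, where a plain substring search that ignores case would count 3.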

Still, this is a simple homework problem to solve. Try it!

Some comments to the original code:

No, it is not OK to spin the outer loop millions of times per second, redlining one CPU core and heating your computer.
NO, it is not OK to place a wait inside the inner loop. Searching should progress as fast as possible (seconds, not hours for long files!)
Please don't maximize the front panel and diagram to the screen! (during debugging you need to see both, as well as the help windows!)
The indicator you call "frequency" has the wrong representation and does not display a frequency at all.
Your inner loop always finds the same word forever and will never stop unless the word exists zero times.
Since the keyword control should be constant during the inner loop, it belongs outside it.
Why do you have a shift register if you don't even look at its output?
etc.

LabVIEW Champion.

paul_cardinale · ‎03-14-2022

@RTSLVU wrote:

@paul_cardinale wrote:

One unpleasant complication is figuring out what delimits a word.

Irrelevant, the program searches for an exact match and returns how many times that exact match was found.

Certainly not irrelevant. Suppose the text contains " forum ", and you search for the word "rum" without regard for delimiters; you would get a match there in "forum". But that is not the word "rum" and shouldn't be counted as such.

paul_cardinale · ‎03-14-2022

@altenbach wrote:

@paul_cardinale wrote:

One unpleasant complication is figuring out what delimits a word.

I agree that the problem is much more complicated, because there are plenty of delimiters (space, tab, period, linefeed, double-quote, numbers, etc.) so it is important to only use consecutive strings from the correct lexical class. Just searching for a substring will give you plenty of false positives (e.g. searching for "trust" would also count the substring in "trustworthy" as a "word", etc.).

Still, this is a simple homework problem to solve. Try it!

And you have to make sure that words at the beginning and end don't get missed for lack of a delimiter on one side.
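One way to sidestep the edge problem, sketched in Python under the assumption that words are runs of ASCII letters: extract the words first instead of hunting for delimiters on both sides. The name `count_word_tokens` is hypothetical.

```python
import re

def count_word_tokens(text: str, keyword: str) -> int:
    """Split text into lowercase letter-only words, then count exact matches."""
    # findall returns every maximal run of letters, so a word at the very start
    # or end of the text is captured even with no delimiter beside it.
    words = re.findall(r"[A-Za-z]+", text.lower())
    return words.count(keyword.lower())
```

Because the comparison is between whole words, "rum" never matches inside "forum".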

RTSLVU · ‎03-14-2022

@paul_cardinale wrote:

@RTSLVU wrote:

@paul_cardinale wrote:

One unpleasant complication is figuring out what delimits a word.

Irrelevant, the program searches for an exact match and returns how many times that exact match was found.

Certainly not irrelevant. Suppose the text contains " forum ", and you search for the word "rum" without regard for delimiters; you would get a match there in "forum". But that is not the word "rum" and shouldn't be counted as such.

Oh, good point... I hadn't thought about it that much.


altenbach · ‎03-14-2022

See if this can give you some ideas.

It would be even easier to place all words and their count into a map (try it!).
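As a text-language illustration of the map idea (Python here, not a LabVIEW map), `collections.Counter` builds the word-to-count table in one pass. It again assumes words are runs of ASCII letters and that case is ignored; `word_counts` is a made-up name.

```python
import re
from collections import Counter

def word_counts(text: str) -> Counter:
    """Map every lowercase word in text to the number of times it occurs."""
    return Counter(re.findall(r"[A-Za-z]+", text.lower()))

counts = word_counts("The cat saw the other cat.")
# A Counter returns 0 for a missing key, so absent words need no special case.
print(counts["the"], counts["cat"], counts["dog"])  # prints: 2 2 0
```

Once the table exists, looking up any keyword is a single read, so the text only has to be scanned once no matter how many keywords are queried.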

NOTE that you should set the keyword control to "limit to single line", else a linefeed might sneak in. You might actually want to verify that only a single word is entered (not shown)!
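The single-word check might look something like this in Python, assuming a valid keyword consists of ASCII letters only. The function name `clean_keyword` and the error message are illustrative.

```python
def clean_keyword(raw: str) -> str:
    """Strip stray whitespace (including a trailing linefeed) and require one word."""
    keyword = raw.strip()
    # isalpha() is False for an empty string and for anything containing spaces,
    # digits, or punctuation; isascii() restricts the letters to A-Z and a-z.
    if not (keyword.isascii() and keyword.isalpha()):
        raise ValueError(f"expected a single word, got {raw!r}")
    return keyword
```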

