CONTAINS() v/s FIND() in terms of speed.

I wonder which one outputs/finishes its result faster, for example

NOT(CONTAINS(“He only needs two more cards to complete the set.”, “complete”))

*outputs a logical value

v/s

FIND(“complete”, “He only needs two more cards to complete the set.”) = 0

*outputs a logical value

The difference in speed with which these two operate is negligible. Both need to traverse the text string to find the first occurrence of the search string.

But these two have slightly different uses.

  • CONTAINS() simply reports the text is there or not.
  • FIND() tells you the position of that search string so that you can take another action - such as LEFT(), RIGHT(), MID() - to extract out a portion of the source text.
8 Likes

Thank you @WillowMobileSys for pointing out the slightly different uses that more than slightly enhances understanding of which to use for efficient applications.

1 Like

Thanks for coming by but it does not answer the question. I don’t care the term ‘negligible’ which yourself as a user cannot prove it. Also, I know what the expressions do things for me.

I need some more legit info.

Seriously?

Test it yourself if you don’t trust what I am saying. You likely won’t find any “legit” details, whatever that means, anywhere or from anyone else with regards to those two functions. No one will have tested the speed, It’s just not worth the effort.

I DO KNOW how those functions work and confident in my response. I say “negligible” because if you were to perform timing tests you will NEVER get the same exact speed readings - even for the same function over the same exact data. There are too many environment variables to say they run at exactly the same speed.

If you want the answer in terms of Big “O” analysis (look it up) - they are exactly the same!!

8 Likes

@WillowMobileSys just this: :clap:

@Swoopy you may want to give it a try by yourself with this tool:

Introducing Virtual Column calculation time in Per… - Google Cloud Community

Would you mind keeping up posted if you find a meaningful difference? Thank you

3 Likes

Each expression has its own reason to exist, most of them are provided as building block and used in composition with the others providing the desired outputs.

However, they sometimes could be used interchangeably. This’s why I came up with the question of speed because each expression has different algorithm and costs differently (in com-sci language). The performance may look slightly different in tiny-toy apps but would not be the case in huge apps packed with huge number of rows.

I hope you would know that the calculation results from the mentioned link is not constant because of many factors involved. I just need technical info from engineer team (if possible/available).

Don’t mind keep looking up post if they come by for the technical info.

In short & respect your concern, I just need technical info from engineer team (if possible/available).

As you say, each expression has its own reason to exist, as @WillowMobileSys pointed out and differentiated. As you are trying to find if a string is contained in another string and not where it’s located in that string the CONTAINS() operator would be more appropriate as it returns logical(true/false) value you need. Using the find operator would require determining whether the string contains the substring but would then have to calculate the position of the substring within the string and then the additional operation of converting a number to a logical which is unnecessary additional overhead. As for which would be faster, I am pretty sure that the function that returns the true false value directly instead of returning a number and then converting that number to a true false value would probably be faster.

2 Likes

Interesting reason, thanks.

IMO, the NOT(CONTAINS(…)) would be faster than FIND(…) = 0 because the NOT() would just replace the input value but the “=” would do a binary word comparison.

1 Like

They both will perform exactly the same - speed “negligible”. The most expensive part of these operations BY FAR, is finding, or not finding, the substring in the source text.

To do that, both operations MUST test each character starting from the beginning of the source text and marching down the list of characters one-by-one testing each time to see if the search string is satisfied.

In the case of the search string NOT being present, the entire source string will be tested - in both functions. The extra step of testing “= 0” or flipping FALSE to TRUE in the NOT situation, adds minuscule time to the overall run time of the functions.

They both perform the same.

3 Likes

CONTAINS() is case-insensitive, FIND() is case-sensitive.

6 Likes

Thanks for coming by.

That’s another fair reason the NOT(CONTAINS()) would process faster than FIND() = 0

Ummm… No. Case-sensitive comparison (char1 = char2) is faster than case-insensitive (upper(char1) = upper(char2)). I would expect FIND() to be faster than CONTAINS().

6 Likes

Ahh…right, after rethought your expectation would be correct, the CONTAIN() would somehow/somewhere embed the UPPER() or LOWER() in its algorithm, which would make it needs considerably more processing time.

Now comes to NOT(CONTAINS(…)) and FIND(…) = 0, my thought is that NOT() would be faster than “=” operator. After all there would be some compensation between these 2 expressions but I personally prefer FIND(…) = 0 when working with long-text.

Now that we are on this:

True vs 0=0
Isblank() vs =""
Isblank(List) vs Count(List)=0
Isnotblank() vs Not(Isblank())
Filter(Table) vs Select(Table[Id])

…to be honest, I don’t care, but if this topic is important for some, I don’t think these wouldn’t.

Btw @Swoopy , this is a community, so expect community-level discussion. If you want a technical answer, write to support and wait for as long as you can, send a PM to someone on the team or be patient and accept comments that could not fit on your ideals

3 Likes

Geez guys the person asked a very simple and straight fwd question for valid reasons and it is simple in that if you do not know then just say so.

So when a comment like “this is a community… community level discussion” is it referring to the support personnel expectations or the community members’ expectations and such assumptions may be construed as somewhat offensive and demeaning.

There are also, in addition, the whole series of EXTRACT() functions to consider and the question is why the differences? Is it just no-code convenience or speed?

As APPSHEET code is designed to run on multiple microprocessor-based platforms one presumes it cannot be optimized as is done on say DSP platforms where the difference between assembler and c-code is very marginal.

0.0001 scaled by 100000 is 10 secs so one can imagine.

For example, try Google to find out what programming language APPSHEET is written in?

I presume AS is written for no-code comfort rather than speed.

Does AS make use of any onboard graphics card for faster processing and is both interesting and valid for both techies and “community” members.

Please note I am just chipping in and not seeking any answer or debate.

I end with a quote from Praveen

The smartest move we’ve made so far: “Listening to customers. Not just what they’re saying, but what their true need is.”

The biggest mistake we’ve made so far: “If we’d listened to customers better, we’d have pivoted earlier.”

2 Likes

I’ve never judged anyone’s post for what level it’s on or if it matters for who or what. If I don’t care the post, I rather just ignore it, it’s just that simple.

Many questions have never been answered or solved. Then what level those posts are in? Are they worthless to be answered? or they may be too much technical/difficult for @SkrOYC

This community also includes AppSheet staffs that they may come by and join for any kinds of question as long as they relate to AppSheet. This’s why I posted this here for any replies from whom willing to share their knowledge or idea.

Any ideas regarding the question are always welcome.

1 Like

The cost of NOT(…) vs (… = 0) is likely to be comparable and negligible, and therefore insignificant in relation to the cost of … The significant cost is …

TRUE is better because no operation is involved.

Likely equivalent.

Are in fact equivalent. ISNOTBLANK() and FILTER() are macros.

Meaning why both EXTRACT(“X”, …) and EXTRACTX(…)? I’ve wondered that myself. My guess is just for convenience. I suspect EXTRACTX(…) is just a macro for EXTRACT(“X”, …).

To my knowledge, there is no native code in AppSheet–it is written entirely in interpreted or byte-code-compiled languages. At the very least, I believe there is very little native code. But I may be wrong.

This is exactly why virtual columns are a prime cause of performance problems.

As is true for most large projects, several languages are used.

I wouldn’t make that assumption, myself.

I don’t know, but I strongly doubt it.

From the pre-Google day, I believe. :disappointed_face:

4 Likes