
A battle the tech giants can't afford to sit out! A 10,000-word analysis of AI search products

Years ago, Baidu, Google, and others began working on "intelligent search", but at the time it amounted to little more than simple ranking and summarization of search results, and the effect was underwhelming. With AI now booming, will search become far more powerful with AI behind it? Not necessarily; at least the way AI search works today is far from ideal, as the cases the author shares below illustrate.

This article starts from a negative example in 360 AI Search and discusses three big questions:

1. Why AI search products have become a consensus

2. A guess at the direction in which AI search products will evolve

3. The core experience of AI search products and the factors that influence it

The full text is about 15,000 words, so remember to bookmark it if you can't finish it in one sitting~

---- Main text ----

Recently, while using 360 AI Search, I ran into a serious product design problem that badly hurt the experience.

Like an ordinary search product, the 360 AI Search home page has two modules: an information feed and today's trending searches.

[Screenshot: 360 AI Search home page]

Yesterday I happened to see a trending item titled [4,000 yuan monthly salary, 300 hours of work a month]. The title was genuinely eye-catching, so I opened it to see the details, and the page that opened looked like this:

[Screenshot: the page 360 AI Search generated for this trending item]

The title and the content don't match at all!

The reason is simple: 360 AI Search uses the large model to process the input, instead of simply opening a link when you click on the trending news item.

A traditional search engine would generally match the keywords "monthly salary of 4,000 and 300 hours of work" against various news sites, and the user would then open one of those sites to read the details.

360 apparently passes only the headline of the trending item into the model-processing pipeline, and the result is a complete mismatch between question and answer. (It may be a cost-saving measure, since the full text could consume a hundred times as many tokens as the headline alone, though my later tests seem to contradict that guess.)

There is more than one experience problem here; let's analyze them one by one.

1. Is news content suitable for processing and presentation by a large model?

The advantages of using large models to process news are obvious: they can summarize and extract key points from a large volume of news in a short time, saving users time, and they reduce a publisher's dependence on human editors and reporters, cutting costs.

But here is the question: do users actually want to save time when reading the news? An analogy with web novels may make this easier to see. A famous web novel, A Record of a Mortal's Journey to Immortality, can be summed up in one sentence: after thousands of years of cultivation, the protagonist Han Li finally ascends to the immortal realm, the end. The novel here is analogous to entertainment news, and unfortunately entertainment news makes up a far larger share of what people actually read than serious news does.

Once the details are stripped out by model processing, whether the content can still hold the user's interest becomes an important question. Staying with the 360 AI Search case, here is the news before and after processing.

The entry point:

[Screenshot: the trending item as it appears on the home page]

After processing:

[Screenshot: the model-generated summary page]

Before processing:

https://www.thepaper.cn/newsDetail_forward_27930855 (the original article is fairly long; interested readers can open the link, or just read the conclusions below)

Judging purely by my own subjective feel for reading the news before and after processing, the original article is the better read, for several reasons:

1) Writing style: different types of news call for different styles. This kind of story, a government-affairs bulletin, and UC-style "shocking" clickbait clearly shouldn't read the same, but current AI search products obviously do not yet adapt the style of the final user-facing content to the type of news, so the style hurts the reading experience to some degree.

2) Multimedia information: the original article contains many GIFs, and some links contain video, but the model-processed version is text only. The mix of media types matters a lot for reading; it affects how information is received, how emotion is conveyed, and how much emotional impact the piece has.

Some AI search products can now read many kinds of media, but it is clear that none of them can output multimodal information. (To be precise, they could, but it is very expensive; a 5-second clip from a video generation model costs roughly 1.5 yuan.)

3) Mismatch between information and scenario: the user may be idly scrolling during a lunch break, but what they get is highly structured, work-flavored information; the emotional register is simply wrong.

Every AI search product today sells structured result presentation as a feature, but presentation has to match the scenario, and not every scenario is about saving time.

360 AI Search is said to be able to recognize 4,000 user intents, which sounds like a lot, but given the number of users it currently covers and the range of scenarios they bring, it will take time to grow the intent library by an order of magnitude. After all, "reading the news" alone can be subdivided into many intents~

That is the first aspect of the experience analysis of this small case. It ran long and readers may have forgotten the original problem, so let's pull the topic back: facing the problem in the figure below, what is the second experience issue?

[Screenshot: the mismatched result page]

2. When the input has no subject, should the model add one on its own?

Take the example above: the user wants to read the news about [4,000 yuan monthly salary, 300 hours of work], and the result helpfully calculates that "your hourly wage is 13.33 yuan/hour". How did it decide the subject is the user? That word "your" is a big problem.

In this example the negative experience is mild, but what happens the day a user opens a trending story like "my father died young and my mother raised me alone"? Defaulting the subject to the user would then be a seriously bad experience.

Even setting aside such extreme negative events, stories like "rushed into a fire and saved three people in a row" are fairly common and likely to be searched.

In the long run, if users keep seeing the subject and the event confused, it will erode their trust in the information the whole product provides, which is fatal for a search product. (Users' trust in AI search results is itself an important topic, but how to build that trust is beyond the scope of this piece; I'll write about it another time.)

(This question actually echoes the first point.)

The above is the experience analysis of this small 360 AI Search case. Other AI search products have similar problems, and there is no intention to single out 360 here; as far as the author knows, 360 AI Search's growth rate and iteration speed are impressive, and its experience is far better than that of other 360 products...

Because the author works in user experience, he is in the habit of evaluating products from the experience perspective.

1. Why AI search products have become a consensus

There are so many AI products; why is AI search the one every major company is building?

When discussing AI and products today, one conclusion is fairly clear: AI enters products mainly as new technology and new capability, while users' underlying needs have not substantially changed. What we need to think about is how to use the new capability to solve old needs and deliver a new experience.

About a year and a half ago, Alibaba's former CEO Daniel Zhang said that every application is worth redoing with AI. The author didn't understand what he meant at the time; in hindsight, the point is that the needs stay the same and it is the means of meeting them that changes.

That AI search has become a consensus, especially among the large companies, can of course be explained by market size, the number of user demand scenarios, growth potential, and so on. Only when those indicators are big enough do the giants find it impossible to resist.

For those indicators, just look at Google, Baidu, 360 and the like; the answer is obvious and needs no elaboration. In this article the author mainly wants to approach the question from the angle of user experience. The reason for choosing this angle is as stated above: when user needs are basically unchanged, the key battleground on which AI products replace traditional ones is user experience.

As the author understands it, features are the output of code, while experience is the outcome for the user: the most intuitive indicator from the user's perspective, and the reason a user subjectively decides to keep using a product or to leave.

The following compares the experience of AI search products with that of traditional search products.

To compare them, we first have to go back to why users use search products at all.

When users come to a search product, they always come with a problem to solve (this sounds like a truism, but bear with me). Take completing a product analysis as an example; the complete chain in a traditional search product looks roughly like this:

[Diagram: the chain from a need arising to its solution in traditional search]

Depending on the complexity of the user's intent, the process will pass through 3-6 of the steps above, and in extreme cases there are no search results at all and the problem goes unsolved.

Thanks to many years of vigorous growth and accumulation in the internet industry, and the large number of users who take part in building content, no-result cases are relatively rare. But in more vertical fields they are still a real drag on the experience; for example, the author often searches for questions at the intersection of human factors engineering and interaction design and frequently cannot find an answer.

Just as the foundation of user experience is being able to solve the problem, the foundation of a search product's experience is having an answer.

Traditional search products work by first indexing hundreds of billions of web pages and then matching against them when users search, so they can only solve the problem in the scenarios where an answer already exists, and they only cover 2-3 of the steps in the "from need to solution" process above.

For unanswered questions there are in fact many good solutions and products; Baidu, for example, built a Q&A product, and for problems like search results needing a second round of synthesis or poor answer quality there are designs such as "best answer" and upvote counts.

For instance, if you search Baidu for "2024 legal holidays", the first item on the results page can solve the user's problem directly:

[Screenshot: Baidu's direct answer card for "2024 legal holidays"]

This approach no longer asks the user to pick from a result list and click through, but it requires manually identifying the scenario for special handling, it can only solve simple needs directly, and it naturally conflicts with some advertising and monetization scenarios (search for Youku and the first result "must" be iQiyi), so the overall improvement to the user experience is very limited.

The biggest advantage of AI search products is that they cover more of the steps in the complete process of solving a need, using AI to take over part of the work of the human brain; in terms of an all-in-one experience this is a real step forward.

The experience advantages of AI search products, in detail, are as follows:

1) The ability to obtain information across media and modalities

Getting information across media and modalities means more accurate answers, a more comprehensive knowledge base, and better answers in specialized fields.

For example, suppose I want to know "is there a free seat in XX bookstore". For a traditional search engine to answer that, it would almost have to rely on human users contributing an answer. But if an AI search finds a photo of the bookstore, it can use OCR to read the picture and extract the information. The photo may well have existed in the traditional-search era too, but the information inside it couldn't be used then, so the question could only be answered by a person. This is a simple example of how crossing media can yield more accurate answers.

When talking about AI products, multimodality comes up constantly, and the author finds that many people confuse media types with modality types, so a quick clarification:

Text, images, and video are different types of media, and the information they carry can all be received by humans through the visual modality.

Pictures, sounds, and tastes involve different modality types; the information in them must be received through different modalities: sight, hearing, taste.

The strict definition of a modality in human-computer interaction is: a channel through which information is transmitted.

The ability to convert information between media types and modality types matters in the input and output stages of the search process, and in the accumulation of total knowledge.

For example, in the bookstore case above the answer came from a picture; in the same scenario the answer could also come from audio. If an AI search product were combined with an audio product like Ximalaya, it could draw information in countless professional fields from Ximalaya's huge audio library.

2) Coverage of a wider range of demand scenarios

This should be understood in two ways. The first is that ordinary users can describe complex problems to the search engine in natural language. Traditional search products do have an advanced mode, but its usability is poor and very few users ever reach it; the figure below shows Baidu's advanced search mode. How many ordinary users do you think have ever used it?

[Screenshot: Baidu's advanced search mode]

In a medical product the author worked on, there were cases where describing a query in natural language required multi-field conditions plus dimensional relationships plus logical operators plus multiple levels of parentheses; the operation was laborious even for a professional interaction designer, let alone an ordinary user.

The second is that the various capabilities of large models create new usage scenarios; for example, a large number of 360 AI Search users use the product's generation and rewriting capabilities to solve their own needs. Users' mental model of what a search product is changes as the boundaries of its capabilities expand.

3) Shorter paths to a solved need, and lower complexity

AI search products can aggregate, summarize, and present the content of multiple web pages in a structured way, which improves efficiency in these parts of the overall process:

[Diagram: the steps of the process where AI improves efficiency]

In these steps, AI clearly alleviates the scattered-information problem of traditional search, which is one of the biggest differences between AI search and traditional search at the current stage. It also reduces the interference from advertising to some extent.

In the future every company will certainly add advertising to its AI search product; the timing depends on how fast AI search grows. Compared with a general-purpose chatbot, an AI search product consumes far more tokens in the input and output stages: for the same one-line query, "product design principles", the AI search product first has to feed the content of multiple web pages to the large model, and the tokens consumed in that step can be hundreds of times what a chatbot uses. Facing costs like these, commercialization is inevitable.
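A rough back-of-envelope (with assumed numbers, not real vendor figures) shows where that multiplier comes from:

```python
# Back-of-envelope with assumed numbers, not real vendor data:
# a plain chatbot turn vs. an AI-search turn that feeds 10 retrieved
# web pages into the prompt before answering.
chatbot_input_tokens = 50           # just the user's question
tokens_per_page = 4_000             # assumed average length of a retrieved page
pages_fed_to_model = 10

search_input_tokens = chatbot_input_tokens + tokens_per_page * pages_fed_to_model
print(search_input_tokens / chatbot_input_tokens)  # roughly 800x more input tokens
```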

In the "pick a site, read, switch sites, read again" part of the process, AI significantly reduces the complexity of solving the problem: users no longer have to zigzag between sites, jumping, filtering, and synthesizing. This matters especially for knowledge work, because it lets users stay focused on the task at hand. When the author is writing and has to look something up, the tortuous, drawn-out lookup process often breaks his train of thought, not least because it means battling through two full screens of ads.

4) More diverse ways of presenting information

Many AI search products now use mind maps to show the structure of the result, and some support one-click PPT generation. Diversified presentation means the [secondary processing] step in the process above gets more support: more of the process is done on the user's behalf.

In the future, beyond mind maps and PPTs, commonly used flow charts, topology diagrams, fishbone diagrams, and all sorts of data charts could be chosen based on the judgment made during intent recognition, or different diagrams could be generated from the current result.

In the 360 case at the start of this article, it was also noted that the answer is essentially text-only; the many pictures and GIFs in the original linked article have disappeared, which hurts the reading experience a great deal.

5) A more user-friendly ad experience

In traditional search, the interface around an ad placement is determined by the webmaster of the third-party site, so the ad's style can differ sharply from the content's style; the ad is obviously distinguishable, abrupt, and jarring.

As shown below: an ad inside a WeChat official-account article (style not controllable) versus a native ad on Zhihu (style controllable); one glance is enough to feel the gap in experience.

[Screenshots: an ad in a WeChat official-account article vs. a native ad on Zhihu]

The content on an AI search product's results page is generated by its own large model, and its style is fully controllable and customizable, so the visual style of ads can be unified with the content, bringing better conversion and less disturbance to users.

6) More accurate results and higher-quality content

This can be understood together with point 1). Because most results in traditional search are provided by third-party websites, neither the accuracy nor the quality of the content can be controlled.

The way AI search products work makes the results more accurate. After the user enters a query, the model first rewrites the question; for example, "best-performing phone under 2,000 yuan" might be rewritten as "as of July 2024, the best-performing phone under 2,000 yuan sold in China", with the system filling in the parts the user knows implicitly but didn't write. The more precise the problem description, the more accurate the result.

The second reason is that at this stage the results of AI search do not come from a single source; they are generally formed by aggregating, comparing, and summarizing multiple content sources, and when choosing sources the system can pick more reliable ones based on the type of question: news can come from official media, coding questions from CSDN. (People really do omit information both sides take for granted; for example, I wrote "the second reason" just now without ever saying "the first reason", and it didn't hurt anyone's understanding~)

At the same time, to save tokens and keep feedback fast, not all retrieved results (say, 10,000 articles) are passed to the model; only a handful (say, 10) are selected as source material. When picking those 10 out of the 10,000, the selection may be based on indicators such as source site, view count, authorship, interactions, and relevance.

In the end, the articles with more views, more upvotes, and authorship by a few well-known professionals are selected out of the 10,000 and passed to the large model, so through this screening the results of AI search can be more accurate and the content quality higher.
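As a purely illustrative sketch of that screening step (my own example, not any vendor's real pipeline; the field names and weights are assumptions), the selection could look like this:

```python
# Illustrative only: rank retrieved articles and keep the top k for the model.
# Field names and weights are assumptions, not any product's real scheme.
import math
from dataclasses import dataclass

@dataclass
class Article:
    url: str
    relevance: float         # 0-1, from the retrieval/matching stage
    views: int
    likes: int
    source_authority: float  # 0-1, e.g. official media or a known expert

def score(a: Article) -> float:
    # Log-damp raw counts so one viral post does not dominate the ranking.
    return (0.5 * a.relevance
            + 0.2 * a.source_authority
            + 0.2 * math.log1p(a.views) / 15
            + 0.1 * math.log1p(a.likes) / 10)

def select_sources(candidates: list[Article], k: int = 10) -> list[Article]:
    return sorted(candidates, key=score, reverse=True)[:k]
```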

These six points are the experience advantages of AI search products. Next, let's talk about the author's guesses about the future evolution of AI search products.

2. The evolution direction of AI search products

To reach a more reliable conclusion, we again start from the search process. The traditional search flow can be described simply as:

[Diagram: the traditional search flow]

Once AI is combined in, the following can be done at each stage:

1) Input stage: Expand the input method

At present, traditional search engines basically support text and voice search, and a few non-general search products also support image search.

Note that, in the author's view, simple speech-to-text input doesn't deserve to be called voice search: it only changes the form in which text is entered, not the total amount of information, and the non-textual information carried by the voice modality is not folded into the query.

In the near term the improvements will be to the usability of existing input methods, such as better accuracy and less time, together with using the user's input for completion, error correction, and query recommendation (query recommendation is probably already live in some product).

These input methods may be extended to more media types, such as GIFs and video, whose content would be read and turned into queries.

But that's not cool enough! It still confines us to the usual human-computer interaction loop. To go further we have to break the habitual framing: why does a search have to start with the user actively entering information?

The input step can go from manual to automatic; in other words, the input stage can become transparent within the search process.

Think about it: when we are reading a long AI article, if the system combines the user's reading history, current reading progress, how long they have lingered on a particular paragraph, and other signals, it can very plausibly infer that the user is fuzzy about the meaning of some term in that paragraph, and at that moment it can simply show the term's meaning. That makes the input stage transparent (automated).

Of course, products at that stage may not appear soon; for now the inference still has to lean on simple explicit user actions, such as Doubao's select-to-search feature, which serves as a transitional solution.

In human-computer interaction, intent can generally be inferred from behavior. An automated input process needs a much larger amount of information about the user's context: what information is on the screen the user is looking at, what information is in the physical environment, and so on, combined with a large amount of historical data and current feature data to infer the question the user wants to ask.

A real-life example: a 5-year-old is reading a passage aloud, hits the character [貔], and their voice stops. A smart-textbook product that knows the reading progress, a rare-character list, and the fact that the sound has stopped could simply tell the child "this character is pronounced pi", without the child having to ask. That makes the search process transparent (automated, or passive).
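A toy illustration of that "smart textbook" logic (every signal name and threshold here is invented for the sketch, not taken from any real product):

```python
# Toy sketch of proactive help from behavior signals; all thresholds and
# signal names are invented for illustration, not from any real product.
def should_show_pronunciation(char: str,
                              rare_chars: set[str],
                              dwell_seconds: float,
                              reading_aloud_paused: bool) -> bool:
    return (char in rare_chars            # the character is on the rare list
            and dwell_seconds > 2.0       # the reader has lingered on it
            and reading_aloud_paused)     # the child's voice has stopped

# e.g. should_show_pronunciation("貔", {"貔", "貅"}, 3.1, True) -> True
```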

In applied AI, intent recognition is a crucial capability, and the precondition for accurate intent recognition is multimodal interaction, or more precisely, multimodal information flowing from the human to the machine during the interaction.

When the sources and modalities of information increase, the total amount of information increases, and with more known conditions the accuracy of solving the problem (the product's accuracy in judging the user's intent) naturally rises.

Just as in a conversation between people, language carries only about half of the total information. Multimodal interaction lets the machine receive the part of the information it previously could not, improving intent-recognition accuracy at the source. (This is also the foundation for AI search linking directly to other services, discussed later.)

But that still restricts [modality] to the range of [human information channels]. For a machine it may not be [multimodal] so much as [super-modal]: humans have only five senses, but once a machine is fitted with sensors it can have many more channels, such as gyroscopes, GPS, infrared, electromagnetic waves, and sound waves humans cannot perceive...

So at the lowest level, the number of information channels available to a machine can far exceed a human's. Once the computing power and algorithms of the middle layer are solved, intent-recognition accuracy could well reach human level, and the next stage beyond intent recognition is the intent prediction we just mentioned (the smart textbook predicting, from multiple signals, that the child will not be able to read the character 貔).

The significance of intent prediction is huge: it turns feedback into proactive service, which is an important change in human-computer interaction. Speaking as an interaction designer, I genuinely loved the human-computer interaction segments in Honor's phone launch events; very cool!

A bit off-topic; pulling back: the paragraphs above are the author's guesses at the future evolution of the input stage of AI search products. Beyond this, there may be further gains in emotional understanding and cross-lingual ability, which won't be expanded on here. On to the query stage.

2) Query stage: fuse in other information

At present, after the user enters a query, AI search generally rewrites the question to make it more precise or to cover more of what the user probably needs, for example rewriting "RAG" as "what does RAG mean", or even as "the specific meaning of RAG in AI search products".

In this way the rewriting in the query stage further increases the amount of input information, so that more accurate information can be found.
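A minimal sketch of prompt-based rewriting (the prompt wording and the `chat()` helper are placeholders, not any product's actual implementation):

```python
# Illustrative prompt-based query rewriting; chat() stands in for whatever
# LLM client the product actually uses.
REWRITE_PROMPT = """You are the query-rewriting step of an AI search engine.
Rewrite the user's query into one complete, unambiguous question.
Add constraints the user clearly assumes but did not write (date, region, domain).
Today's date: {today}
User query: {query}
Rewritten query:"""

def rewrite_query(query: str, today: str, chat) -> str:
    return chat(REWRITE_PROMPT.format(today=today, query=query)).strip()

# e.g. rewrite_query("RAG", "2024-07-15", chat)
# -> "What does RAG (retrieval-augmented generation) mean in AI search products?"
```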

The author doesn't know much about the technical details involved, but based on the principle that the clearer the problem description, the more accurate the answer, the author's guess is that the next step for rewriting is to fuse in more information, rather than only modifying and extending what the user typed this time.

Fusing in more information means combining the user's personal profile, past queries, copying behavior, writing data, and many other kinds of historical behavior data with what the user has just entered, and judging from the combination to produce the result.

In fact, the recommendation algorithms of some content platforms and e-commerce sites are already very accurate; the content or products we need are often recommended before we search, because those platforms hold a huge amount of user data.

AI search products, however, may not have anywhere near the variety and volume of user data that e-commerce products do. So, to improve search accuracy, the author speculates that in the future the big companies may push for data exchange; but under the current business model of search products, no company has a strong enough interest to drive that.

Therefore the author's view is that the business model of AI search and the accumulation/interoperability of data may advance together: if an AI search product's answers can recommend more precisely targeted paid services and products, bringing advertisers higher conversion and revenue, then today's data holders may be willing to supply their data to AI search products. (Of course, a data holder could also expand its own business and build a similar product itself.)

Many details of the actual usage would need to be negotiated, such as whether the raw data is directly visible or only features are provided.

3) Output stage: Expand the output mode

Output methods also span different modalities, media, forms, and file types. At present most products support mind maps and PPT, and in the future they should support flow charts, fishbone diagrams, and more, to cover more user needs.

[Screenshot: user feedback]

At the same time, refining the forms already supported matters a lot. For example, at present the mind map in an answer can only be downloaded as an image, which doesn't meet the need to edit and modify it; being able to generate an XMind source file, or to edit the mind map directly in the web page, would be very valuable.

As for PPT layout and polish, current AI search products are frankly weak; compared with a product like Gamma they get crushed, and there is still a big gap even against domestic PPT tools.

For creative scenarios, generating images that relate to the content is also a significant capability: how to segment a very long answer, extract the keywords relevant to each image, and keep the visual style of all the images in the piece consistent.

The above covers supporting different media forms in the output stage; next, output in different modalities.

Output as text and output as sound serve different use cases; for example, when the user is a little farther from the device, the visual modality can no longer deliver the information effectively.

In multi-task scenarios, receiving information through a different modality's channel can also create a better collaborative experience and let the user stay focused on the main task.

For example, many people now want dual monitors, mainly to handle multi-task collaboration. That approach suits scenarios where understanding the content takes time; if you only want one simple figure, you could say "Xiao Ai, look up Baidu's 2024 revenue for me", receive it as sound, and write it straight into your article, avoiding the sense of disruption that comes from switching between interfaces.

Going further, the output stage also has to consider the user's needs for storing and sharing information, and can even build associations across multiple pieces of content to help the user find them again later.

As the author understands it, information storage is best tied to a note-taking product, ideally with seamless import into notes and association with related topics. The simplest way is to extract shared keywords to form tags, and then let content be filtered by tag.
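A minimal sketch of that keyword-tagging idea (a real product would more likely use the model itself or a proper keyword extractor; this only shows the association mechanism):

```python
# Illustrative: derive shared tags between a new answer and an existing note
# by simple keyword overlap. A real product would likely use an LLM or a
# dedicated keyword extractor; this only demonstrates the idea.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}

def keywords(text: str, top_n: int = 5) -> set[str]:
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return {w for w, _ in Counter(words).most_common(top_n)}

def shared_tags(new_answer: str, note: str) -> set[str]:
    return keywords(new_answer) & keywords(note)
```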

For sharing, the product needs to consider the sharing channel, the polish of the shared layout, and attaching whatever the user needs (the author's ID, their self-media handle, even contact details), so the user doesn't have to do a second round of editing.

4) Results-browsing stage: a thousand faces for a thousand users

This stage is the focus of every AI search product; it mainly uses the summarization ability of large models, plus text-to-image generation, to give users a more aggregated, clearer, structured browsing experience.

But there is a problem: the entertainment-news reading scenario discussed earlier is not suited to structured, summarized presentation.

So the author speculates that once future AI search models can identify more diverse and fine-grained user scenarios and intents, the interface will be styled differently for different scenarios and intents.

At present, structured result display only fits the professional-knowledge-reading subset of reading scenarios, while search products cover a vast number of scenarios. Watching shows, downloading files, navigational queries and the like need more specific, more personalized interface design; a navigational scenario may need no interface at all. Once confidence in the navigational intent is high enough, searching for [Youku] can simply open the site directly.

Seen this way, "a thousand faces for a thousand users" need not stop at page styling; the entire flow can branch by intent. Combined with the other conjectures above, the future search flow may become unrecognizable:

[Diagram: a possible future search flow]

Leaving business considerations aside, fonts, colors, layout, and many other visual styles in the page design could also be personalized to the user's taste. That too can be part of the experience gain, but it has to be balanced against visual consistency and branding.

5) Reuse of results and community-building

At present the cost of AI search is still high; according to figures disclosed in the podcast by Super Huang and Mr. Liang, the VP in charge of 360's AI business, each search costs roughly 0.2 yuan.

Treat the rough composition of that cost as the tokens consumed in the input and output stages; then for new questions whose similarity to an earlier question exceeds a certain bar, the earlier answer can simply be reused, saving the token cost of the output stage.

For questions that are similar but fall below the bar, previously generated answers can still serve as source material when generating the new answer: the result produced for the earlier question is already a distillation of several pieces of content and matches the new question better, which may also save some tokens.
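A minimal sketch of that reuse logic, keyed on query-embedding similarity (the `embed()` function and the 0.92 threshold are placeholders, not any product's real values):

```python
# Illustrative answer cache keyed by query embeddings. embed() stands in for a
# real embedding model; the 0.92 threshold is an arbitrary example value.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

class AnswerCache:
    def __init__(self, embed, threshold: float = 0.92):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # list of (query vector, answer) pairs

    def lookup(self, query: str):
        """Return (closest cached answer, True if similar enough to reuse as-is)."""
        if not self.entries:
            return None, False
        q = self.embed(query)
        vec, answer = max(self.entries, key=lambda e: cosine(q, e[0]))
        if cosine(q, vec) >= self.threshold:
            return answer, True    # reuse directly, skip generation
        return answer, False       # below the bar: use only as one extra source

    def store(self, query: str, answer: str) -> None:
        self.entries.append((self.embed(query), answer))
```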

If users can be guided to make their manually revised answers public and accessible to other users, content will gradually accumulate into a content community, and eventually the content community and the AI search product can be integrated into one.

Most results in traditional search come from third-party websites, so although search is a major traffic entrance for the whole internet, it can only run a traffic-selling, advertising business. The root cause is that the content isn't its own, so the business chain stops at the search results.

If AI search products can accumulate content and form communities the way Xiaohongshu and Zhihu have, it would raise the product's ceiling enormously.

Put simply, a search product is generally only used when a need arises, while a content community is something people browse when they have nothing in particular to do.

For example, a user who likes dry-humor jokes will normally follow a joke blogger in some content community rather than searching Baidu for "bad jokes".

The second benefit is user retention. Content is itself a consumable, and it connects KOLs with consumers; both are key to retention. There is no need to belabor the retention power of content-consumption products; just think of Douyin and Xiaohongshu~

Some traditional note-taking products follow the same idea: with authorization, publicly showcase the high-quality notes users create, build a content community on top of a large user base, break through the ceiling of a note-taking product, and turn a tool product into a community product. Evernote's Chinese edition, for example, has a separate [Zhitang] product that grew out of the content side of its notes product.

AI search products are even better positioned to develop along this path, because the content is easier to create and its comprehensiveness and public-domain quality can reach a reasonable standard; most importantly, the content produced with their own computing power gets retained and generates value by being consumed a second, third, Nth time.

Looking across the whole article, you'll notice that the value of data shows up in every part of an AI search product: in the input stage, the user's personal data helps rewrite the question more clearly and precisely; in the matching stage it surfaces more information sources; in the output stage it determines answer accuracy and content quality; and in the post-search service stage it may even give the product a chance to break through the ceiling of search.

It follows that data is the second most important competitive moat for AI search products (indeed for all AI products); the first, without question, is model capability.

Once a combined community/search product takes shape, the more important implication is on the business side: a richer revenue mix. For businesses it doesn't own, it can still monetize by selling traffic like a traditional search engine; for its own businesses, it can go beyond selling traffic to selling products and capture more of the profit.

From this angle, the bigger an AI search product grows, the more it matters to companies with more lines of business. Add the fact that it is the next generation's traffic entrance, and the author believes AI search is a battle the big players must fight.

6) From search products to all products

The five points above are basically guesses about the evolution of AI search as a standalone search product, but "search + AI" capability can show up in any product that needs a search function.

Take note-taking products: ten years of accumulated notes are hard to search and hard to connect, and AI is badly needed for aggregating related content. With AI, you get more accurate search, fuzzy search, Q&A over your own notes, and so on.

Similarly, the search flow in e-commerce products can rewrite search keywords to match products more accurately; the same capability works in enterprise knowledge-management products, and can speed up searching scientific literature in specific industries.

So the author believes that AI search in the broad sense may not remain an independent product, but will instead do its work inside the search scenarios of many kinds of products.

The essence of search is people's need for information, and future AI search will take two main forms:

one based on matching existing, directly usable information, and one based on aggregating and generating content that isn't directly available.

3. The core experience of AI search products

After all this rambling, the core experience of an AI search product should already be clear. In the order of the user's path, it consists of:

input experience, feedback speed, result quality, receiving experience, and post-search service. The detailed factors behind each are as follows.

1) Input experience

Input experience refers, first, to the media and file types the product accepts: text, images, audio, video, GIFs, documents, links... The more types supported, the more freedom the user has, the more scenarios are covered, and the less manual format conversion is forced by input restrictions; so the more input types supported, the better the experience.

The second aspect of input experience is the ability to understand non-textual information. For example, when users search by voice, can the product extract extra information from speaking rate, volume, and pauses, and fuse it with the transcribed text to form a more accurate input query?

For example, if the user types "12400F and 12490F" and the system expands it to "12400F and 12490F: which of these two CPUs is better in performance, power consumption, and gaming experience", the problem is described more completely and the answer can be more accurate. So the input experience does not just mean the experience of typing; it means the overall impact of everything from the user's input up to the moment the query enters the model.

2) Feedback speed

Feedback speed is determined by the index, model efficiency, computing power, server performance, network speed, and the amount of data that has to be returned to the user.

The index is a database of content whose special data structure improves query efficiency, so a query can find relevant results without scanning all the data, and complex conditional queries can be completed efficiently. The more efficient the index, the shorter the feedback time.
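A tiny inverted-index sketch shows why a query doesn't have to scan every document (simplified to exact term matching; real search indexes are far more elaborate):

```python
# Minimal inverted index: map each term to the set of document ids containing it,
# so a query only touches the posting lists of its own terms instead of scanning
# every document.
from collections import defaultdict

class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)   # term -> set of doc ids

    def add(self, doc_id: int, text: str) -> None:
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query: str) -> set:
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())  # AND semantics
        return result
```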

Model efficiency determines how fast result content is generated in the output stage; different models can differ markedly in generation speed, so model efficiency also affects feedback speed. The faster the feedback, the sooner the user gets the result and the better the experience.

Computing power (the share available to users) directly affects generation speed, and demand for it varies markedly over time; working hours obviously demand more than the middle of the night. At peak times, paid tiers can be used to give paying users a better experience, or other incentives that benefit the company can grant priority access; a few months ago Kimi Chat trialed a reward mechanism that let paying users get priority computing power at peak hours.

Idle computing power can also be used to pre-generate content users are likely to need, or answers to long-tail questions, and serve them directly when needed, improving feedback speed.

Likewise, server performance and network speed have a considerable impact on feedback speed, but these two apply equally to traditional search and are not unique to AI search.

Another factor is how much has to be output. When a user searches "in which year did Jobs found Apple", they may only want a specific year, not a long essay; dumping everything about Apple and Jobs may be meaningless to them, and generating it all both burns tokens, raising costs, and slows the feedback.

In some scenarios there can even be no output at all: in a navigational scenario the user's ultimate goal is to open a website, so opening it directly, with no results page, is an excellent low-cost, short-path experience.

3) Quality of results

Result quality is determined by the volume of data in the index, the source-selection rules, the total number of sources, model quality, the amount of information in the input query, and the accuracy of problem understanding.

The larger the index, the more information the matching step can find to answer the user's question, and the more likely the question can actually be answered.

The source-selection rules affect the quality of the information passed to the model: for the same question, imagine the difference in result quality between using answers from Baidu's Q&A pages as the source and using Zhihu.

Of course, source selection is not a simple choice between Baidu and Zhihu. Generally, for questions in professional fields, specialist information can be drawn from vertical sites, where quality is better; for common questions, selection may rest on content relevance, view counts, authorship, number of interactions, publication time... The overall principle is to use various direct or indirect indicators as a proxy for content quality, and to pass the better articles to the model for summarization and structuring. It is then easy to see that the more reasonable the source-selection rules, the better the result quality and the better the experience.

Model quality comes into play once the answer material is fed into the model: given the same input, different models can produce very different answers, so the higher the model quality, the higher the result quality and the better the experience.

Part of model quality is also the ability to understand natural language: can it accurately grasp what the user means and what the user needs? I have to bring up 360 AI Search again: when I searched for "tiger picture", it couldn't take me straight to image results, but instead showed me a results page like this:

[Screenshot: 360 AI Search's result page for "tiger picture"]

First, the main body of the page described two pictures to me in text; then it recommended other tiger-related information; and meanwhile the guide bubble in the upper-right corner wouldn't close no matter how long I kept tapping it. A terrible experience.

Accurately understanding the question determines how the rest of the flow proceeds. A simple example: when I type "Youku", should the product introduce Youku to me, or just give me a link to jump there?

The more information, the better the result quality, but only up to a point: beyond a critical value, additional information improves result quality very little. The need to set a reasonable cutoff again confirms the importance of source-selection rules.

4) Receiving experience

The receiving experience is determined by indicators such as the media/modality/format types that can be output, the UI, the time needed for secondary processing, and the ad experience.

The more media types and formats the product can output, the wider the coverage of user needs; it is the difference between having and not having, it saves the user a round of conversion, and the impact on experience needs no elaboration.

Output modality is a slightly different matter. In a driving scenario, for instance, audio output clearly suits the user better; in an office scenario, visual output is better.

So supporting output in different modalities is, first, about matching how users in different scenarios are best able to receive information, and second, about multimodal collaboration further improving the efficiency of information transmission.

The visual modality can receive information more than 100 times as efficiently as hearing, but the auditory modality has the properties of passivity, attention sensitivity, and omnidirectionality.

Passivity means the information can be received passively, so it is less likely to be missed than visual information; attention sensitivity means changes in sound are noticed more quickly; omnidirectionality means the source can be anywhere in the 360° around a person and still be received.

Given the different characteristics of the visual and auditory modalities, multimodal fusion can play to each one's strengths, helping users handle several tasks at once and receive information more easily across scenarios. (Multimodal interaction is a big topic; doing it justice would take another 10,000-word article, so it won't be expanded here.)

That's a brief expansion on the characteristics of information in different modalities; now, the impact of the UI on the receiving experience.

The UI is the longest-developed and most deeply studied way of transmitting information through the visual channel, and the visual channel is how humans receive more than 90% of their information, so the UI counts as one of the factors shaping the receiving experience.

UI design in the broad sense includes layout, text, graphics, motion, interaction patterns, and their secondary attributes. Because vision is the main way humans take in information from the outside world, the UI is a very important part of the receiving experience.

Layout determines the order in which the user takes in information and the visual pressure; typefaces determine how hard the text is to read (compare cursive with regular script) and how pleasing it looks; graphics express information more intuitively and with more feeling; motion guides attention so the visual focus stays on the target information; and interaction patterns let users reach hidden or multi-step information more naturally.

One example is how token generation speed affects the UI. Many chatbots now render the answer token by token, painting each token on the interface as it arrives, which produces strong visual churn, seriously distracts the user, and lowers the efficiency of information reception.

At present, differences in token generation speed mostly show up at the vendor-pricing level; the author looked and found no vendor that prices by generation speed. From the standpoint of feedback speed, faster generation is always better, but the interval at which the interface is updated can be controlled.

Generally, a first wait of under 2 seconds will not drive users away, so one option is to generate a chunk of content and paint it on the interface in one go, avoiding constant interface churn. (Think of the little ads that keep popping around on spammy websites; the feeling is similar~)
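A minimal sketch of that idea, buffering streamed tokens and flushing them to the interface per sentence or on a fixed interval instead of on every token (the interval and sentence-end heuristic are arbitrary choices for illustration):

```python
# Illustrative: buffer streamed tokens and flush them to the UI in larger chunks
# (per sentence, or every ~0.5 s) instead of repainting on every single token.
import time

def display_stream(token_iter, render, flush_interval: float = 0.5) -> None:
    """token_iter yields tokens from the model; render(text) appends text to the UI."""
    buffer = []
    last_flush = time.monotonic()
    for token in token_iter:
        buffer.append(token)
        sentence_done = token.endswith((".", "!", "?", "。", "！", "？"))
        if sentence_done or time.monotonic() - last_flush >= flush_interval:
            render("".join(buffer))
            buffer, last_flush = [], time.monotonic()
    if buffer:
        render("".join(buffer))   # flush whatever is left at the end
```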

The time needed for secondary processing is affected by the output media/modality/format types and by result quality, as discussed above. Users will inevitably hit cases where a search result cannot be used directly in another context (a report, say), and the time spent reworking the content greatly affects the experience.

For example, can the mind map be edited online, or only after downloading? Can a generated image be partially modified? The longer the secondary processing takes, the worse the experience.

The ad experience is an unavoidable topic; AI search products will have to commercialize to cover their costs. The impact of visual style on the ad experience was covered earlier with the WeChat and Zhihu examples.

Beyond visual style, it is just as important that the ad content matches the user's attributes; when the ad happens to be exactly what the user wants and fits their spending power, the overall ad experience can even be positive.

If the ecosystem and the pool of cooperating advertisers are large enough, weaving advertising content invisibly into the answer itself will be an important change in future ad formats.

At this stage, the biggest experience problem with keyword-based advertising is that it isn't what the user needs; what the user needs and what the ad recommends don't match, so the ad gets in the way of finding and reading the real target information.

If the target of the user's search is "AI course", then even an ad selling a course won't hurt the experience, because that is exactly what the user wants. It is even better if the quality of the course (the quality of the product behind the ad) can be assured, and the precondition for assuring it is the large advertiser pool mentioned above, which gives room to screen.

5) Post-search service

The post-search service experience is determined by indicators such as service scope, search-service integration, service-intent matching, service path length, information memory, and the ad experience.

Service scope means whether, after finding the relevant information, the product can come close to solving the need in one stop: search for a Beijing travel guide and go on to book flights, hotels, and tours to Beijing; search for "iPhone 15" and place the order right on the results page.

This ties back to the topics above, such as data exchange and integrating AI search with other businesses. Obviously, the wider the scope of services AI search can offer, the shorter the path, the simpler the operation, and the better the experience.

In the traditional post-search flow, the user bounces between the platforms of different companies; the path is long, and accounts, passwords, and addresses have to be entered repeatedly in different products, which is cumbersome and carries fraud risk.

If the AI search product can be integrated with other businesses, it can come closer to a one-stop solution instead of splitting the need across multiple products from multiple companies; in the travel scenario, for example, the guide plus flights, hotels, and tours could all be handled at once. That is what search-service integration means.

When a large company covers many lines of business, whether the user's search intent can be mapped accurately to the right business becomes an important factor in both conversion and experience.

Service path length was also illustrated above: when the user's goal is to open a website, opening it directly without a results page is an excellent low-cost, short-path experience; there is no need to put a website entry at the top of a results page that the user then has to click again. However, this particular case may reduce ad impressions, so the real-world trade-off needs careful thought.

The ad experience has also been mentioned above, so I won't talk about it anymore.

---- Summary ----

To recap, this article covered three main themes: why AI search products have become a consensus, a guess at the direction in which they will evolve, and their core experience and influencing factors.

Most chatbots and other AI products have existed for a very short time; many user experience problems have not yet been worked through in detail, and most companies are still preoccupied with technical problems at the model level.

The author, however, has always believed that users don't actually care about model-level technical issues; what touches them directly is the experience, and it is the experience that, within a very short time and subjectively, decides whether they keep using a product.

That is why the author focuses on problems of AI product experience, and will share more AI product experience cases in the future.

This article references:

1. Super Huang's article "Double 1 Billion: AI Reshapes Search | Understanding the Current Situation and Future of AI Search"

2. The article "AI Search, Explained in One Article"

Columnist

Du Zhao, WeChat official account: AI and User Experience. Columnist at Everyone Is a Product Manager. A hands-on designer, currently responsible for phone OS interaction design at a mobile phone company, working on products covering hundreds of millions of users; his main focus is the integration of AI with human-computer interaction design and the impact of human factors on user experience.

This article was originally published on Everyone Is a Product Manager. Reproduction without permission is prohibited.

The title image is from Pixabay, under the CC0 license.

The views in this article represent only the author's own; Everyone Is a Product Manager, as a platform, only provides information storage services.
