



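In this article, we look at conversation as a way of coordinating joint action. Conversation is dynamic, full of signals and interaction. We can start a conversation the way we intend to, but we often cannot guarantee where it will end. "Chatbot" and "conversation" are nouns, but to understand them well it helps to think of them as verbs. How do we interact with other people? How do we make sure a conversation moves in the direction we want?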


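A large part of any conversation is its context. Suppose you are at a ball and want to ask someone to dance. You may only need to walk over, nod, and say, "May I?" Your partner will understand that you are inviting her to dance. But ask a stranger on the street the same question and she will probably be confused: she may not know what you are requesting, or she may take it as nothing more than a friendly greeting. That is a conversation that does not fit its context.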


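1. Greeting each other: This part is easy to understand. When you say "Good morning" and the other person replies "Have you eaten?", both of you are saying hello. Greetings take many forms, but the goal is the same: establishing a good relationship.

2. Exchanging information: When you ask "What are you planning to do tonight?", you expect a matching answer. Part of conversation is the process of asking questions and giving answers.

3. Prompting action: When you say "Let's go shopping together tomorrow" or "Could you grab my laptop for me?", part of the conversation is about making plans and asking someone to do something.

4. Establishing opinions: When you say "I agree, grapes taste better than apples", you are stating your position.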







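Stanford professor Chris Potts has a maxim: normal events can be handled with normal language, while unusual events require unusual language. Chatbots generally have no trouble with normal events; it is the unusual ones they struggle with.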






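Once you have set your goals, consider what you can learn from everyday interactions. One of the "tricks" is that behind a chatbot there are many answers that engineers programmed in advance. When you say to Siri, "Tell me a joke," Siri does not actually think up, on the spot, a joke it heard last weekend. In reality, Siri consults a lookup table: Apple's engineers anticipated the question and wrote it into Siri's knowledge base.

For most companies, this approach works. Remember the maxim above that normal events can be handled with normal language: you can write answers for the questions that come up frequently in your use case. This works because you can guess in advance how customers will interact with you, or because you already know their usual behavior. Your customer service staff typically know which questions customers ask most, and your ticketing system or other data analysis can show you the common questions and the way customers phrase them. So all we need to do is encode as many of those cases as possible, right?

In practice, a chatbot is not that simple. What we have described so far is only an information retrieval system, or a search system, and a successful chatbot is not a search bar. It needs interaction, it needs conversation, and it needs to coordinate joint action. Below we introduce four kinds of training tasks. You will usually get the training data for them from your own database or from a data service provider, and that data is what lets your chatbot carry on a conversation with customers. They are: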




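Entity extraction: "I really want to eat an apple" and "This apple is really delicious" do not mean the same thing. Entity extraction helps the algorithm pick up on these subtle differences in language.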
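The lookup-table approach described above can be sketched in a few lines. This is a minimal illustration, not how Siri or any real assistant is implemented: the `CANNED_ANSWERS` table, its two entries, and the `normalize` and `reply` helpers are all hypothetical names made up for this example.

```python
import re

# Hypothetical canned answers; a real table would come from your FAQ or support logs.
CANNED_ANSWERS = {
    "tell me a joke": "Why did the chatbot cross the road? It was following a script.",
    "can i change my flight": "Yes. Tell me your booking reference and I can look at options.",
}

FALLBACK = "Sorry, I don't have an answer for that yet."


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def reply(user_text: str) -> str:
    """Return the canned answer for an exact (normalized) match, else a fallback."""
    return CANNED_ANSWERS.get(normalize(user_text), FALLBACK)


print(reply("Tell me a joke!"))
print(reply("Is it possible to move my flight?"))
```

The second request means the same thing as a stored question but is worded differently, so the table misses it. That gap is why a lookup table alone behaves like a search bar rather than a conversation partner.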


1: Utterance, or, How Many Ways Can You Order a Pizza?

To work at all, your chatbot needs to understand what users are asking it to do. And while you can likely identify the most frequent, most normal requests from a user, it's tough to come up with every permutation of those core questions on your own. That's what utterance data collection is all about. The job is simple: set up a task where a bunch of people come up with different ways to ask the same question. What's the question? That's up to you and your team. But you'd be surprised just how many ways there are to ask for the simplest things.


• Transforming FAQ content into a chatbot (you've already written answers, but want to match them to lots of different questions)

• Building up voice/text activation for a new feature (how many ways are there to ask for a song to play?)

An example? Reddit's Random Acts of Pizza, where people ask for pizzas and the community potentially responds. If we look at 5,671 requests for pizza, we'll find that 99.4% of them have unique titles. In fact, there were only four repeats at all! Inside the body of the posts themselves, the only repetition across 27,000 sentences is basically greetings and assorted gratitude.


This is a good example of the breadth of even simple requests. "Please pass the salt" and "Salt!" are both ways to make a request, after all, but they feel rather different. And while people will interact with chatbots differently than they do with people (think about how you search for shoes or use Google; it's not exactly how you talk to your friends), accruing a database of the ways people ask for things gives your chatbot fuel to answer those requests in kind.
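To get a feel for how varied a batch of collected utterances is, you can count how many distinct strings it contains, much like the title count quoted for the pizza requests. The sketch below is only an illustration: the sample utterances are invented, and `uniqueness_report` is a hypothetical helper, not part of any tool mentioned here.

```python
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def uniqueness_report(utterances):
    """Summarize how many distinct (normalized) utterances a batch contains."""
    counts = Counter(normalize(u) for u in utterances)
    total = len(utterances)
    distinct = len(counts)
    return {
        "total": total,
        "distinct": distinct,
        "distinct_share": distinct / total if total else 0.0,
        # Utterances that were submitted more than once, with their counts.
        "repeated": {u: n for u, n in counts.items() if n > 1},
    }


# Invented sample data for the "can I change my flight?" task.
samples = [
    "Can I change my flight?",
    "can i change my flight?",
    "I need to move my flight to another day",
    "How do I rebook?",
    "Change flight",
]

report = uniqueness_report(samples)
print(report)
```

A high distinct share means the task is surfacing new phrasings; a low one suggests contributors are copying each other or the prompt.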

Now, a section or two ago, we mentioned that we're going to use this eBook to demonstrate how to create the data you need to train a chatbot. We chose to create data around an airline customer service chatbot, but of course, you can do utterance tasks for whatever utterances you want to capture. For our example job, we chose to ask for ways to ask "can I change my flight?" Again, there are no specifics here (like "I need to change flight 563" or "I have to fly to Vegas instead"), so the pool of utterance data is artificially limited a bit, but here's how you do it:


Pretty easy, right? Now, one of the things we pride ourselves on is quality control. But with utterance tasks, that can be tough. You can't come up with the "correct" ways to ask this question in advance (in fact, that is exactly the data you're trying to accrue), so you can't use the typical test question format most jobs take. We get around this in a pretty simple way: two different, intertwined jobs.

Next, let’s look at relevance:

2: Relevance, or, Are We Making Sense or Not?

Once you have a set of utterances, you want to be able to match them with answers and actions. Relevance tasks do this by giving you training data you can use to map utterances that users might say to the help pages and action triggers in your database. They are usually of the form, "here's a question, here's an answer, how relevant is it?"

In doing this mapping, you are likely to find that certain flavors of questions need longer or shorter responses. The more a response just looks like "the best matching paragraphs" or "an adjacent answer from our FAQ section," the less direct help it offers, the less human it feels, and the less satisfied your user is.

To get a sense of how people know what to say, let's look at the four maxims Paul Grice developed that people follow when talking. If you flout these maxims, things get weird.

1. Quantity: be as informative as you possibly can and give as much information as is needed, and no more

2. Quality: try to be truthful and don't give information that is false or that is not supported by evidence

3. Relation: try to be relevant and say things that are pertinent to the discussion

4. Manner: try to be as clear, as brief, and as orderly as you can in what you say and avoid obscurity and ambiguity

We can reduce these even more. For Dan Sperber and Deirdre Wilson, the central thing is "Be relevant."

The issue for chatbots is that they can have trouble understanding context. They're certainly worse at it than we are. And because of that, some of their responses are, well, irrelevant. And irrelevant responses make for bad conversations. They don't coordinate joint action.

This is one of the reasons it's much simpler to create a chatbot to handle discrete issues (like rescheduling a flight) than one that just wants to talk about any old thing.

You see similar tasks in search relevance projects: given a query, does this result match? Is it relevant? Doing that with chatbot question/answer pairings gives you the tools you need to tweak your models and make them more accurate. It also will show you where your model is falling down and where it's succeeding.
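One way to use relevance judgments is to average the scores annotators gave each question/answer pairing and flag the pairings that score poorly. The sketch below assumes a 1 to 3 relevance scale and made-up answer IDs such as `faq_baggage_fees`; neither comes from a real platform, and the 2.0 cutoff is an arbitrary choice for illustration.

```python
from collections import defaultdict
from statistics import mean

# Invented judgments: (utterance, answer_id, score), where 1 = irrelevant
# and 3 = fully relevant. Each pairing was rated by two annotators.
judgments = [
    ("can I change my flight?", "faq_change_flight", 3),
    ("can I change my flight?", "faq_change_flight", 3),
    ("can I change my flight?", "faq_baggage_fees", 1),
    ("can I change my flight?", "faq_baggage_fees", 1),
    ("baggage fees?", "faq_baggage_fees", 3),
    ("baggage fees?", "faq_baggage_fees", 2),
]


def average_relevance(rows):
    """Average the annotator scores for each (utterance, answer) pairing."""
    scores = defaultdict(list)
    for utterance, answer_id, score in rows:
        scores[(utterance, answer_id)].append(score)
    return {pair: mean(values) for pair, values in scores.items()}


def weak_pairs(averages, threshold=2.0):
    """Return pairings whose average relevance falls below the threshold."""
    return sorted(pair for pair, avg in averages.items() if avg < threshold)


averages = average_relevance(judgments)
for pair in weak_pairs(averages):
    print("Needs a better answer:", pair)
```

The flagged pairings are the places where the model is falling down; the high-scoring ones can be kept as positive training examples.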

3: Intent, or, What Were You Trying to Do Anyway?

When we're engaging with people in joint activities like conversation, we are (or become) attuned to their intentions. That's what's behind the comedy of something like Lucy and Charlie Brown's "I know you know I know you know" chains of reasoning. Other minds aren't entirely opaque to us, even if we tend to fill them in with our own projections.

Much like the last example, you see intent work in information retrieval projects like internal search relevance tasks. Basically: does this output match the intent of what someone wanted? When someone searches for an iPhone and they're presented with an iPhone case, does that match their intent? The same is true for chatbot replies. Given a question from your utterance corpus, how relevant is the answer your model or hardcoded bot returns?

The reality is that relevance isn't quite enough for chatbots. Conversation is simply too complicated for simple relevance to make chatbot responses good enough.

Take the airline customer chatbot we're building. Imagine a customer typing "baggage fees?" What do they actually mean? Are they asking what the baggage fees for a particular flight are? Are they demanding a refund for baggage fees they were recently charged? A chatbot that doesn't understand context and intent might just send the customer to an FAQ about baggage fees. And that customer isn't going to be particularly enthused about that interaction.

Intent and relevance are intrinsically linked. You want to start the process by identifying which flags your chatbot will be able to support. Do you want to handle your top ten issues? Top five? You want to tackle as many permutations of those conversations as possible in your relevance and intent tasks. And keep in mind, these tasks are sometimes even more valuable for tuning your bot after it's been released or with test conversations you conduct with it. You'll be able to analyze whole conversations, find out where they fall down, and give annotators fuller conversations to understand customer intent.

Because, really, that's an important point here: intent shows itself most clearly in the context of a full conversation. That "baggage fees?" comment means very different things depending on the particular conversation it appears in.


Intent tasks often present annotators with conversations (or snippets thereof) and ask them whether the chatbot is understanding the intent of the customer. In the places it did not, it's important to understand where and why your bot hit a snag. Once that's understood, you can hone your models or hard-code answers to deal with precisely those issues.

Last thing: remember that point we made about your chatbot's personality? That comes into play here. If your chatbot isn't sure it's going to be relevant (essentially, it isn't confident about its output) or is at sea over intent, just ask! Chatbots that deal with requests by asking a series of probing questions to find the exact thing the user is looking to do are far, far more successful than those that make pseudo-guesses where they're not fully confident. When in doubt, your chatbot should aim towards further clarity, not action.
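The "when in doubt, ask" rule can be expressed as a confidence threshold on whatever intent scores your model produces. In the sketch below, the intent names, the scores, and the 0.7 threshold are all assumptions chosen for illustration; `choose_action` is a hypothetical helper, and the scores would come from your own intent classifier.

```python
def choose_action(intent_scores, threshold=0.7):
    """Act on the top intent only if its score clears the threshold.

    Returns ("act", intent_name) when confident, otherwise
    ("clarify", [candidate intents]) so the bot can ask a follow-up question.
    """
    if not intent_scores:
        return ("clarify", [])
    ranked = sorted(intent_scores.items(), key=lambda kv: kv[1], reverse=True)
    top_intent, top_score = ranked[0]
    if top_score >= threshold:
        return ("act", top_intent)
    # Not confident: offer the two most likely readings instead of guessing.
    return ("clarify", [name for name, _ in ranked[:2]])


# "baggage fees?" is ambiguous, so the made-up scores are close together.
ambiguous = {
    "baggage_fee_info": 0.48,
    "baggage_fee_refund": 0.45,
    "change_flight": 0.07,
}
# "can I change my flight?" is clear.
clear = {"change_flight": 0.93, "baggage_fee_info": 0.07}

print(choose_action(ambiguous))
print(choose_action(clear))
```

For the ambiguous message the bot would ask something like "Do you want to see baggage fees or request a refund?" rather than sending the customer straight to an FAQ page.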

4: Entity Recognition, or, Which Washington is this Washington?

Entity recognition is the last major training job for your algorithm. Essentially, it involves looking at passages of text and identifying "entities" within. Those might be places, people, product names, you name it, but it generally works best when you look for specific entities that are valid for your particular use case.

Take our example use case of an airline service chatbot. If you tell it that you're looking to go to Washington, what does that mean? Because it could mean any of the following:


You get the idea. Now, if you're building a chatbot that's looking to engage over American history, Washington has a totally different meaning. Ditto for a bot looking to give out college sports scores. The list goes on.

For starters, this is why more generic, multi-purpose bots are so difficult and why context is so important for any chatbot. But it's also why you need to work on entity extraction for your chatbot project. In fact, named entity recognition is one of the basic building blocks of natural language processing, and it allows your bot to function properly.

We've created an entity extraction tool that's very similar to a popular one you may have heard of called BRAT. Essentially, on our platform, you provide users with text blocks and they highlight the entities you care about. You can see an example below:


In that screenshot, we're interested in a few salient things to build into our airline chatbot. Note especially that numbers are important here. Is it a flight number? An arrival time? A number of ounces for carry-on sunscreen? The more examples of named entities your model sees, the more it learns that sometimes people won't type "7:25" and will just type "725" instead, and your bot will still understand. That increases your bot's accuracy, its ability to actually converse, and, yes, makes it function in the way it's supposed to: coordinating joint action.
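Highlighted entities are commonly stored as character spans with a label attached. The sketch below shows one such representation under stated assumptions: the `Span` class, the label names (`FLIGHT_NUMBER`, `DESTINATION`, `DEPARTURE_TIME`), and the sample sentence are hypothetical, and `span_of` merely stands in for an annotator highlighting a phrase.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # index of the first highlighted character
    end: int    # index one past the last highlighted character
    label: str


def span_of(text, phrase, label):
    """Stand-in for an annotator highlighting the first occurrence of `phrase`."""
    start = text.index(phrase)
    return Span(start, start + len(phrase), label)


def entities(text, spans):
    """Return (surface_text, label) pairs, rejecting spans outside the text."""
    found = []
    for span in spans:
        if not 0 <= span.start < span.end <= len(text):
            raise ValueError(f"span out of range: {span}")
        found.append((text[span.start:span.end], span.label))
    return found


text = "Flight 563 to Washington leaves at 725"
spans = [
    span_of(text, "563", "FLIGHT_NUMBER"),
    span_of(text, "Washington", "DESTINATION"),
    span_of(text, "725", "DEPARTURE_TIME"),
]

labeled = entities(text, spans)
print(labeled)
# How many training examples each label has so far.
print(Counter(label for _, label in labeled))
```

Both "563" and "725" are bare three-digit numbers; only the label records that one is a flight number and the other a time, which is the distinction the model has to learn from many such examples.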


Nice as it would be, you can't just buy chatbot software out of a box and simply deploy it. You need to test, tune, and train your chatbot. Hopefully, this eBook gave you an understanding of how that's actually done. But we do want to highlight a few of the key takeaways we'd love to leave you with now that we're finished:

• Conversations are about coordinating joint action. The best chatbots have real conversations and, thus, coordinate real joint actions.

• When in doubt, make sure your chatbot is curious. A curious chatbot understands what a user really wants before acting. And people are much more willing to answer a few extra questions than deal with bad outcomes.

• There are four major chatbot data projects. Each is important. They are:

    • Utterance tasks: How many ways are there to say a thing?

    • Relevance tasks: Does this response even make sense?

    • Intent tasks: What did the user want to happen here?

    • Entity extraction: What are these particular words exactly?
