From RAG to Greater AI Richness
A look at retrieval augmented generation and its role in AI's ongoing evolution, with Miho Ezawa
This transcript is edited for clarity.
Charlie Guarino: Hi everybody. This is Charlie Guarino. Welcome back to another edition of TechTalk SMB. Today’s podcast is a unique opportunity for me because we’re trying a format that we’ve never done before: we have two people as guests on the podcast. I’m very happy to first announce that I have Miho Ezawa, and Miho is located in Tokyo and she is recognized as an AI expert. I first met Miho last year at an IBM Champions event and it became very obvious to me that she really embodies what a Champion does in this area. A quick bio about her: Miho works for CRESCO in corporate sales, focusing on innovative technologies like AI and communication robots. Currently she leads the AI and data technology department at CRESCO—also located in Tokyo—guiding their advancements in AI and data tech while also consulting on AI implementation. She has been an IBM Champion since 2019 and won the Idea Award at the second IBM Watson Japanese Hackathon in 2016. Very impressive, and I’m really so happy to be with you here today, Miho. Thank you very much for joining.
Miho Ezawa: I’ve been looking forward to the podcast.
Charlie: Great. Thank you, Miho. I should point out that to facilitate this conversation, we decided that Miho will speak her native Japanese, and we brought in a second guest, Kosuke Kakizawa, also located in Tokyo, Japan. I met him at a conference in Europe a couple of years ago, and last year he gave me the unique opportunity to speak at one of his company’s events in Tokyo. Kosuke is the president of SANWA [Technologies], located in Tokyo, Japan. His company specializes in digital transformation, modernization, cybersecurity on IBM i and other technologies, and it also partners with several overseas businesses. Kosuke, it’s a real pleasure to see you again. Thank you so much for joining me here today.
Kosuke Kakizawa: Yes, thank you very much for having me. Happy to help.
Charlie: Great. Thank you, and as I said, Kosuke is going to be playing the role of interpreter for us, so thank you for doing this. This is such a unique thing that we’re trying here today. When Miho and I last spoke, one thing became very obvious: if you do any kind of research at all in or about AI, you quickly realize that one of the more important newer components of AI is RAG. The acronym stands for retrieval augmented generation, and the more you research AI, the more you’ll keep coming across it. So I’m going to start with that very question, Miho—and again, Miho will respond in Japanese. Miho, just tell us: what is RAG, retrieval augmented generation? What is it, and why should people be concerned about it?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: So when generative AI gives an answer, the AI generates that answer based on generally known knowledge, generally known information. The answer wouldn’t actually go outside of what was pre-trained when the model was created. By having this RAG, retrieval augmented generation, the answers can be based on updated information. So for instance, whenever an organization has this generative AI, all the answers can be based on the updated, latest information, which may reside only in that particular organization. So in order to give this updated or more precise, more accurate answer—in order to avoid hallucination, which as you know is one of the most critical issues generative AI has right now—RAG can actually mitigate the hallucination problem and give a more accurate answer.
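To make that retrieve-then-generate flow concrete, here is a minimal sketch in Python. Everything in it—the toy document store, the keyword-overlap retriever, and the call_llm placeholder—is a hypothetical stand-in rather than any particular vendor’s API; real systems retrieve with vector embeddings rather than word overlap.

```python
# Minimal retrieve-then-generate sketch; every name here is a hypothetical
# stand-in, not any vendor's API. Retrieval is naive keyword overlap.

# A tiny in-house "knowledge base" the base model was never trained on.
DOCUMENTS = [
    "Expense reports must be filed in the HR portal within 30 days.",
    "The Tokyo office moved to the Shinagawa building in 2024.",
    "VPN access requests go through the internal ServiceHub tool.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank documents by word overlap with the question; return the top k."""
    q_words = set(question.lower().split())
    return sorted(DOCUMENTS,
                  key=lambda d: len(q_words & set(d.lower().split())),
                  reverse=True)[:k]

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (hosted or local)."""
    return f"[model answer grounded in a {len(prompt)}-char prompt]"

def generate_answer(question: str) -> str:
    """The RAG step: prepend retrieved context, then generate."""
    context = "\n".join(retrieve(question))
    prompt = (f"Answer using ONLY this context:\n{context}\n\n"
              f"Question: {question}")
    return call_llm(prompt)

print(generate_answer("How do I file an expense report?"))
```

The key move is the one Kosuke describes: retrieved, organization-specific text is injected into the prompt, so the model answers from current facts instead of only its training data.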
Charlie: So what did we have before we had R-A-G, or RAG? If there are companies relying on the result sets of AI and they’re not using RAG, does that suggest that what they get back from the LLMs, the large language models, may not be entirely accurate? And what does that do to their confidence in the resulting data that’s returned from AI?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: So it’s not about having RAG or not having RAG. There are particular areas that generative AI without RAG is actually good at. For instance, if a person would like to do some planning and retrieve ideas from generative AI, the AI is very good at that, right? Or for a person who would like to generate comments on a colleague’s report, generative AI is good at that too. But take a case where an employee wants to search for the procedure—the steps for how to apply for an expenditure within the organization. If this person goes into the model and asks for that procedure without RAG, the AI generates an answer which is not based on fact. It lacks accuracy. It sounds like the correct answer, but it is not based on fact. So that’s the difference between with RAG and without RAG: without RAG, there were certain areas where generative AI was strong, but it wasn’t good at these factual, organization-specific questions. By having RAG, generative AI can be good at both. So yeah, that’s the difference.
Charlie: Just so I am clear, it sounds like the payload that is returned from AI is then interrogated against known, credible sources—that’s the definition I keep hearing. It’s validated against known, credible sources to ensure the data is not hallucinated and is in fact correct.
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: Yes, Charlie, you are right. By using RAG, the data, the information, will be interrogated, but there’s a problem within it. The problem is that when RAG retrieves the data within the organization, that data might be outdated. If RAG retrieves data from an old data source, then the output that RAG generates could be outdated as well. That’s the kind of issue that RAG, and this generative AI, has as a major issue right now.
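One common guard against the stale-source problem Miho raises is to tag every document with a last-reviewed date and filter out anything past a freshness window before it ever reaches the prompt. A minimal sketch, with hypothetical data and an arbitrary one-year policy:

```python
from datetime import date, timedelta

# Hypothetical internal documents, each tagged with a last-reviewed date.
DOCUMENTS = [
    {"text": "Expense filing deadline is 30 days.", "updated": date(2025, 1, 10)},
    {"text": "Expense filing deadline is 14 days.", "updated": date(2019, 6, 1)},
]

MAX_AGE = timedelta(days=365)  # policy choice: ignore sources older than a year

def fresh_documents(today: date) -> list[str]:
    """Keep only documents inside the freshness window before retrieval."""
    return [d["text"] for d in DOCUMENTS if today - d["updated"] <= MAX_AGE]

# Only the current policy survives to reach the prompt.
print(fresh_documents(date(2025, 6, 1)))
```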
Charlie: Would that then suggest that if the large language model was better trained, or trained on more recent data, that we then would not need RAG to assist us?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: So it’s a matter of a trade-off between having the LLM continuously trained with the updated, latest information, or having RAG. From Miho’s standpoint, she thinks continuous training of the LLM is not sufficient on its own. There are concerns about the cost of the training, and there’s also a report indicating that having RAG give answers based on the latest retrieved information is more efficient than retraining these LLMs.
Charlie: Okay. In some of the research I’ve done using RAG, I see one popular use—one very good use case, I should say—is in chatbots, where users can communicate with a chatbot and get very meaningful responses. Is that a perfect example of using RAG, or are there other applications that are even better suited for it?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: Yes. For instance, creating FAQs—frequently asked questions—is a definite use of RAG right now. Whether it’s an FAQ for a certain specialist audience or an FAQ meant to satisfy general customers, by using RAG, generative AI can generate the needed FAQ just as it’s wanted.
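As one illustration of that workflow, here is a hypothetical sketch of drafting one FAQ entry per topic: retrieve the internal context for each topic, then ask the model to phrase it as a Q&A pair. The retrieve and call_llm helpers are stand-ins invented for this example:

```python
# Hypothetical sketch: drafting one FAQ entry per topic by retrieving
# per-topic context first. retrieve() and call_llm() are stand-ins.

TOPICS = ["expense filing", "VPN access"]

def retrieve(topic: str) -> str:
    # Stand-in for a real search over internal documents.
    return f"(internal docs about {topic})"

def call_llm(prompt: str) -> str:
    # Placeholder for any LLM API; returns a canned Q&A shape.
    topic = prompt.split(":")[-1].strip()
    return f"Q: How does {topic} work?\nA: ..."

def draft_faq() -> list[str]:
    """Retrieve context for each topic, then ask the model for a Q&A pair."""
    entries = []
    for topic in TOPICS:
        context = retrieve(topic)
        prompt = f"Using {context}, write one FAQ entry about: {topic}"
        entries.append(call_llm(prompt))
    return entries

for entry in draft_faq():
    print(entry, "\n")
```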
Charlie: I have also read that large companies that we recognize—such as Salesforce and an airline like JetBlue—are now implementing RAG in their technologies. What do you know about how Salesforce or JetBlue or any other big business might be using RAG today?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: So with Salesforce, it seems that they use RAG within the Einstein Copilot search. It’s one of the functions of Salesforce, and this functionality provides search across past customer inquiries. It has actually improved customer satisfaction by improving their customer service. With JetBlue, they have a chatbot, and they have expanded its capability by using RAG and improved their customer satisfaction as well.
Charlie: I would say in the old days of AI—which is a very funny term considering how quickly AI is moving and certainly how popular it has become in a very short amount of time—but it seems to me that RAG is something that may have been needed from the very beginning. Is there anything else that is similar to RAG, a related technology that might be out there that does the same function or maybe improves upon it? Is RAG the beginning of a new wave of how we consume AI data and are there other technologies out there that people can use instead of RAG?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: Okay Charlie, so there are several technologies that she is looking into right now, but one area she is particularly looking into is AI agents. By having an AI agent, people no longer have to do the prompt engineering. The agent will decide what to retrieve—it has the RAG technology within it—and it will decide what to retrieve from the database or from the information source. For instance, if I wanted to create some sort of report, the agent would ask me for all the meta information, the metadata, that it requires, then start searching for the needed information and create the report automatically, as sketched below. Currently the user needs to type in the prompt in a sufficient way, but by having this agent, we don’t have to do that anymore. The agent will do it all for us.
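Here is a minimal sketch of that agent loop, under the same caveat as before: every helper name is invented for illustration. The agent asks for missing metadata, formulates the retrieval query itself, and only then generates:

```python
# Hypothetical agent loop: the agent, not the user, gathers missing
# metadata, formulates the retrieval query, and drafts the report.

REQUIRED_FIELDS = ["topic", "time_period", "audience"]

def retrieve(query: str) -> list[str]:
    # Stand-in retriever; a real agent would query a document or vector store.
    return [f"(document matched for: {query})"]

def call_llm(prompt: str) -> str:
    # Placeholder for any LLM API call.
    return f"[draft report from a {len(prompt)}-char prompt]"

def run_report_agent(request: dict) -> str:
    """Ask for missing metadata, retrieve sources, then draft the report."""
    # Step 1: the agent decides what it is still missing and asks the user.
    for field in REQUIRED_FIELDS:
        if field not in request:
            request[field] = input(f"Agent: what is the {field}? ")

    # Step 2: the agent formulates the retrieval query itself.
    sources = retrieve(f"{request['topic']} {request['time_period']}")

    # Step 3: generation, grounded in what was retrieved.
    prompt = (f"Write a report for {request['audience']} based on:\n"
              + "\n".join(sources))
    return call_llm(prompt)

# The user supplies only a partial request; the agent fills in the rest.
print(run_report_agent({"topic": "Q1 sales"}))
```

The design point is the inversion Kosuke describes: the query construction that used to be the user’s prompt-engineering job now happens inside the loop.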
Charlie: That’s interesting to me, because in the little I’ve read to date on RAG, it’s always been just one direction—returning data from the payload to me to improve the quality. I had never expected it to also act and maybe replace a prompt engineer. I find that very fascinating. Is that a new use of RAG that’s in the mainstream yet?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: So the idea of the AI agent has been discussed for a while, but people have started to use AI agents because of the innovation that’s happening in this AI technology area. AI agents are getting really popular, many open-source projects have been releasing them recently, and Miho is actually very, very interested in them.
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: Miho would like to share one other technology that she’s looking into right now: multi-modal. Conventionally, generative AI was good at producing outputs based on a particular single modality—from text, from audio, from one single image—but with this multi-modal technology, it can combine multiple inputs, using not only text but also, at the same time, pictures, and create the requested reports or output based on those multiple inputs. For instance, Miho gave us a case where the AI model was given a picture and the request was to create HTML based on that picture, and the model was able to create the HTML successfully because of this multi-modal functionality.
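The picture-to-HTML flow might look like the sketch below. The multimodal_generate function is a purely hypothetical stand-in for whatever multimodal model API is actually used; the point is simply that one request carries both an image and a text instruction:

```python
import base64
from pathlib import Path

def multimodal_generate(text: str, image_b64: str) -> str:
    # Purely hypothetical stand-in: a real call would send the text and the
    # image together, in one request, to a multimodal model API.
    return f"<!-- HTML drafted from image ({len(image_b64)} base64 chars) -->"

def mockup_to_html(image_path: str) -> str:
    """Combine an image input with a text instruction in a single request."""
    image_b64 = base64.b64encode(Path(image_path).read_bytes()).decode()
    instruction = "Generate HTML and CSS that reproduce this mock-up."
    return multimodal_generate(instruction, image_b64)

# Demo with placeholder bytes standing in for a real screenshot.
Path("mockup.png").write_bytes(b"\x89PNG placeholder")
print(mockup_to_html("mockup.png"))
```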
Charlie: I think multi-modal also might include other inputs. For example, another input source like IoT could be used in addition to text and images to help produce even better result sets.
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: Yes, she agrees. All the data that comes out of IoT sensors could make for really interesting inputs.
Charlie: So we’ve been saying how R-A-G, or RAG, does improve the quality of the data that comes back and helps build more confidence in AI in general. Are there any reports out there at all that give us a real hard number—a percentage, perhaps—of how much better the data result set is with RAG than without it?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: Yes, there’s a report from Perplexity.ai that by having RAG, accuracy improved by 30%.
Charlie: Which is significant.
Kosuke: Yes, very. Yes.
Charlie: Wow. So I could easily see here now why using RAG in research with AI is a real benefit, but how does any one developer begin their journey into implementing RAG? Is it easy to implement? Is it a difficult process? Is it something that can be replicated across an entire enterprise?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: Having this RAG functionality in a generative AI model is not that difficult. If there is a development company that is good at creating generative AI, perhaps that company could deliver RAG functionality within a few weeks. To do that, the dev company might use an API on the LLM. There are several databases which are good at supporting this RAG functionality, and by using those services it is quite easy to bring a RAG solution into the world. Miho did express that one thing the user needs to be really cautious about is the information source. As she said moments ago, the information source needs to be updated. If the information source is old or not updated, then the retrieved data lacks accuracy. So the data source needs to be kept updated—this is the most critical part.
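To give a sense of why a prototype can come together that quickly: once you have an embedding model, the core of a database that is good at RAG—a vector similarity search—is only a few lines. A minimal sketch, where embed is a hypothetical stand-in (a real system would call an embedding model; the toy hash-seeded vectors here carry no meaning and exist only so the demo runs):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical embedding call. A real system would call an embedding
    # model API; these hash-seeded vectors are only demo filler and do
    # not capture any semantics.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

DOCS = [
    "Expense reports are filed through the HR portal.",
    "The VPN client is rolled out via the IT self-service tool.",
]
DOC_VECS = np.stack([embed(d) for d in DOCS])  # the "vector database"

def top_k(question: str, k: int = 1) -> list[str]:
    """Unit vectors make the dot product equal to cosine similarity."""
    scores = DOC_VECS @ embed(question)
    return [DOCS[i] for i in np.argsort(scores)[::-1][:k]]

print(top_k("How do I submit an expense report?"))
```

Production systems swap the in-memory array for a purpose-built vector database and, per Miho’s caution, keep the indexed documents current.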
Charlie: Do you think that RAG will then help remove some of the bias that occurs naturally when using AI?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: So on the matter of solving the bias of the LLM, RAG is not the actual answer. The bias within an LLM arises from how the LLM was created; it is outside of RAG’s functionality. So she thinks that RAG is not a way to solve bias within an LLM.
Charlie: Okay. So we’ll start wrapping up this conversation, but I have two more questions for you that we did not discuss before our interview here today. The first one, Miho, is: how does somebody stay current on a technology such as AI when it’s moving so quickly? There’s a fear that anything I read today may be outdated in as little as a few weeks or a few months. How do you stay current in such a fast-moving technology? That’s my first question.
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: So that’s a very, very difficult question to answer. She actually uses AI to gather all the latest, updated information. Perplexity.ai, which she mentioned earlier, is one of the methods she’s actually using to collect all that information.
Charlie: That’s a perfect response. All right, my final question is just a personal one, and it is this: I wish I were many, many years younger, because this technology feels very different to me than any other I’ve worked on since I was a very young developer many, many years ago. What do you say to somebody today who is just beginning their career, or maybe even looking for a change in their career? What are your thoughts on guiding somebody into this area of technology? It’s certainly here to stay, but do you feel confident recommending it as a true career path, and do you think it is worthwhile? Certainly you’re here, you’re in this space, because you find it interesting, but speak to those people who are just thinking about a new career, or a new career in IT perhaps. In your own words, what can you say to these people?
[CONVERSATION BETWEEN MIHO AND KOSUKE]
Kosuke: So she thinks that this AI is a promising field—she thinks this is it. But for the people thinking of this area as their career path, she gave some advice: she wants them to use the many kinds of generative AI, because generative AI can be used to solve any kind of problem, any kind of task—like, for instance, generating call programs. It can be used in so many ways, so just use it. Just by getting hands-on with generative AI, or any AI technology, they will see their path for themselves.
Charlie: Sounds like great advice. Thank you for that. I have to say that this has been such a unique experience for me. I’m thrilled. I’m so happy we were able to do this, and it just speaks to the power of technology. Here I am in the United States, you’re in Tokyo, and we’re having this three-way conversation. It’s amazing to me and I cannot thank both of you enough for your time, and for your expertise and your knowledge and everything else. Thank you, thank you, thank you so much, really from the bottom of my heart. Thank you very much.
Miho: Thank you, Charlie. I enjoyed it very much. Thank you so much.
Charlie: I will say that Miho will be presenting sessions in the United States, which is where I first met her. She does speak English, of course—we just decided to do this one in Japanese—but she had the courage and strength to speak in English to a completely English-speaking crowd. I was very impressed. What a great experience that was.
Miho: Thank you, Charlie.
Charlie: I am looking forward to seeing both of you later this year in the United States or in Tokyo. Wherever we decide to meet will be wonderful, and again, thank you. On behalf of TechChannel, thank you for doing this wonderful podcast with me. I hope we can meet again perhaps in the future and as AI continues to evolve only for the better, I think we will have many opportunities to regroup perhaps and continue this wonderful discussion. So thank you so, so much. All right, this wraps up our podcast. We look forward to you enjoying future podcasts with us. Thank you everybody for joining us and we will speak to you at our next podcast. Thanks so much. Bye now.