Data Science and Design
Dive into "CoPilot Conversations" for a unique A-Z journey through the world of OpenAI and beyond. Our hosts, a Data Scientist and a UX Designer, seamlessly bridge data with design, turning complex AI algorithms into intuitive user experiences. As they guide you through each letter of the alphabet, they'll highlight how data-driven insights merge with design principles to shape the future. Tune in to "CoPilot Conversations" and embark on an enlightening exploration where data meets design in the tech landscape! 🎙️🚀🌌
The Data Scientist: Will Johnson, a data scientist at Microsoft. Passionate about AI & Machine Learning, Will delves deep into the world of algorithms and data-driven insights. In this podcast, he will share his expertise on the mechanics behind advanced systems and the innovations shaping the AI landscape.
The UX Designer: Meghan Fasano, a UX designer at Microsoft. With a flair for crafting intuitive digital experiences, Meghan collaborates with diverse teams to design user-centric solutions. On "CoPilot Conversations", she'll shed light on the art and science of creating seamless user interfaces and experiences.
ChatGPT Meets Recommender Systems: A Match Made in AI
Join us on our show as we dive into the exciting advancements in the field of chatbots and recommender systems. Our co-host is especially thrilled about our newest tool, ChatGPT, a large language model trained by OpenAI. ChatGPT's power lies in its ability to understand and generate humanlike text. Today, we'll explore how this can be used to improve recommender systems in the retail space. We'll talk about how chat can create a more humanistic experience by powering recommender systems and how ChatGPT can be used to personalize product descriptions. However, we'll also discuss the potential concerns of using such a tool, such as safety and recommendations based on incorrect data. Join us as we discuss solutions to these challenges and more.
Will Today we'll be talking about the exciting advancements in the field of chatbots and recommender systems. And I've got to say, my co-host here is obsessed with our newest and greatest tool, ChatGPT.
Meghan It's like having a data scientist and designer in one tool.
Will That sounds pretty awesome. And I think it's possible because ChatGPT is one of those large language models that was trained by OpenAI. Those large language models use unsupervised learning, which is a technique for training on massive amounts of data without actually having any specific labels or guidance. They didn't tell it what to look for. It just learns these concepts and learns these ideas.
Will And that's what makes it so powerful and able to understand and even generate humanlike text that we see and we love playing with today. First, let's explore how exactly that can be used to improve recommender systems in the retail space.
Meghan I want to start off by talking about a concept. I was thinking about how ChatGPT can help create a more humanistic experience by powering recommender systems. We know that chatbots are an old concept and that they must rely on the actual system behind them. There's only so much they can do in the personalization space, correct? What are your thoughts, Will?
Will It's predefined, right? That makes sense. Typically, you have a flow that reaches out to other services like a recommender system and says, hey, you added this thing to a cart. That item-based recommender system can say, hey, I've seen this. We can use item-based collaborative filtering to say these two items are similar or are bought by similar-looking people, right?
Will Yeah. So that's been done. I think ChatGPT doesn't have that built into it, right? But you could build it.
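To make the item-based collaborative filtering idea concrete, here is a minimal sketch, not the hosts' implementation. It assumes a handful of made-up shopping baskets and scores two items as similar when they tend to show up in the same baskets (cosine similarity over co-occurrence counts). The basket contents and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy baskets: each set is one customer's purchase.
baskets = [
    {"power_washer", "nozzle", "deck_cleaner"},
    {"power_washer", "nozzle"},
    {"power_washer", "hose"},
    {"paint", "brush"},
]


def item_similarities(baskets):
    """Cosine similarity between items, based on which baskets contain them."""
    item_counts = defaultdict(int)  # baskets containing each item
    pair_counts = defaultdict(int)  # baskets containing both items of a pair
    for basket in baskets:
        for item in basket:
            item_counts[item] += 1
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), both in pair_counts.items():
        score = both / sqrt(item_counts[a] * item_counts[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def similar_items(sims, item, top_n=3):
    """Items most often bought with `item`, best first (ties broken by name)."""
    ranked = sorted(sims.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


sims = item_similarities(baskets)
print(similar_items(sims, "power_washer"))
```

A chatbot flow like the one Will describes would call something like `similar_items` when an item lands in the cart, then phrase the result for the customer.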
Meghan And there's another aspect to this. What the chatbot would do is get to know the person. It would say, what are you looking for today? The user can come to the site and say, I want to pressure wash my deck this week. Then ChatGPT could not only talk to the person in that chat, but it could go into the back end and personalize actual product descriptions by extracting how-to guides. Or if they said, I'm looking for a pressure washer to pressure wash my deck,
Meghan then it could tell you how to use a wood solution within that product description.
Will I love the idea. I would just be a little concerned with that. People have seen it come up with random ideas, and I'd be concerned for the safety of the product. It will come up with a really good-sounding solution, but it might be the wrong thing for the actual care and maintenance of the product. And I think recommender systems suffer from this too sometimes. If your market baskets always contain the wrong products bought together,
Will the algorithms that are running a recommender system would still give you the wrong answer. Recommendations are only as good as the data you provide. I think ChatGPT may do the same thing. So how would you design some guardrails around that? How could you put in business rules or signals to the user that say, hey, this is an auto-generated response?
Will These are some ideas, but check with your local rep. Have you thought about something like that to build guardrails around this?
Meghan Recommender systems are a valuable tool for businesses, but they're not a one-size-fits-all solution. In order to create the most accurate and personalized recommendations, it is often necessary for a program manager to manually input the relevant information into the system. For example, a home improvement store could take the time to create detailed content such as buying guides and how-tos to help the customer make informed decisions.
Meghan This would allow the recommender system to recommend products that complement the search, resulting in a more efficient and personalized shopping experience. Furthermore, creating product description pages on the fly based on a customer's search can also enhance the personalized experience. For example, if a customer is searching for a power washer to clean their deck, the recommender system could create a product description page that focuses on how to pressure wash a deck and suggests related products that would be useful for that task.
Meghan This not only helps the customer find the right products, it also improves their overall experience.
Will Yeah, that's more traditional, I guess. Traditional is a funny word for recommender systems, because they've only been out for a couple of decades, right? But a more traditional way would be a content-based recommender system, where I'm going to look at those same buying guides, but I'm just going to compare the words on the page with other documents on my website or PDFs that I've collected in my library.
Will And then at best, a content-based filtering algorithm might recommend products that look like these pages. So it would recommend other buying guides, and maybe for certain terms that are really unique it might recommend similar how-to articles or similar products that mention those things. But only ChatGPT could potentially rewrite the page at runtime to show the customer exactly what they might need to do and display it in real time.
Will A recommender system couldn't touch that at all. That's a really good quality, and I like that a lot, so long as it's additionally trained on your data, like you said. That puts in some more guardrails.
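The content-based approach Will describes, comparing the words on one page with the words on other pages, can be sketched roughly as TF-IDF vectors plus cosine similarity. This is an illustrative sketch only: the page names and text are invented, and a real system would use proper tokenization and a much larger corpus.

```python
import math
from collections import Counter

# Hypothetical page text; real buying guides would be much longer.
docs = {
    "deck_washing_guide": "how to pressure wash a wood deck with a power washer",
    "power_washer_page": "electric power washer with nozzle for deck and patio",
    "eyeshadow_tutorial": "eye shadow makeup tutorial for beginners",
}


def tfidf_vectors(docs):
    """Map each document to a sparse {term: tf-idf weight} dictionary."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[name] = {
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(vectors, name):
    """Other documents ranked by similarity to `name`, best first."""
    others = [
        (other, cosine(vectors[name], vec))
        for other, vec in vectors.items()
        if other != name
    ]
    return sorted(others, key=lambda pair: (-pair[1], pair[0]))


vectors = tfidf_vectors(docs)
for other, score in most_similar(vectors, "deck_washing_guide"):
    print(f"{other}: {score:.2f}")
```

Note what this can and cannot do: it ranks existing pages against each other, but it never writes a new page, which is the gap Will says a generative model could fill.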
Meghan It gives it that extra personalization. Yeah, but for me, getting a chatbot that feels like a real human personalizing my experience for that specific search, when I'm looking for a product that I'm going to invest my money in, is just going to help grow my brand loyalty and also my trust in that company.
Meghan Yeah, and in that chatbot.
Will Yeah, that's true. One of the challenges I think we may run into, though, is query understanding. If you just plug power washers into a search engine on a home improvement store's website, it's generally going to search for the keywords power washer. You might have a recommender system built into the search results to take those top items and try to find recommendations based on them, like some really advanced companies do, but we still run into trouble.
Will How does ChatGPT know what to look for? That's where I'm missing the flow. Does the user have to specify, I want to look for a buying guide? Because they might not. They might just be trained to search for basic tokens like power washer and then expect to find more links, or expect to find that content without explicitly stating it.
Will Right. So how would you get ChatGPT to be prompted to write the correct content?
Meghan We're on the topic of power washers, but I almost want to talk about power washers and maybe makeup tutorials. And that's because they go together. Yeah, I know, right? Let's say that I am a home improvement site. Then I'm going to want more buyer guides and directions on what products I need to buy. And additionally, is it possible to put the keywords in there?
Meghan And then let's say it was a makeup site. In that one, the keywords would be more about tutorials. So if there's an eye shadow, you'll get a tutorial rather than a buyer's guide. Is it possible that we could train ChatGPT to be that middle bridge?
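One way to picture the "middle bridge" Meghan is asking about is a prompt template keyed by site type, so the same search box yields a buying guide on one site and a tutorial on another. The sketch below only builds the prompt text; it does not call any model or API, and the mapping, function name, and wording are all hypothetical.

```python
# Hypothetical mapping from site type to the kind of content to request.
CONTENT_STYLE = {
    "home_improvement": "buying guide",
    "makeup": "tutorial",
}


def build_prompt(site_type, search_query, default_style="product overview"):
    """Turn a bare keyword search into a content request for a language model."""
    style = CONTENT_STYLE.get(site_type, default_style)
    return (
        f"Write a short {style} for a customer who searched for "
        f"'{search_query}'."
    )


print(build_prompt("home_improvement", "power washer"))
print(build_prompt("makeup", "eye shadow"))
```

The customer still types only the basic tokens; whoever owns the site decides which content style each category should map to.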
Will So yeah, right. Because of those prompts, and because it has seen tutorials, seen makeup tutorials, and seen buying guides for other products, it can hallucinate those kinds of things, given the right prompts. You had mentioned this to me privately before, talking about tips and tricks, or mentioning buying guide inside of the prompt. So maybe it's the responsibility of, again, that product manager to figure out what prompts we want ChatGPT to respond to. Whereas more traditionally, in a recommender system, you would just trust in the data and maybe apply a few business rules for cross-sell and upsell in certain categories, where you would boost those recommendations one way or another
Will to ensure that certain things are going to be displayed, even though they might be lower on the scale. You can artificially boost them. With chatbots, right now we have the trouble of those prompts. How do I write a good prompt to actually get it to write the content that I want in the form that I want?
Will And I know you've been playing a lot with this, right? Have you had any success in making something that looks like a well-written document?
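The business-rule boosting Will mentions for traditional recommenders might look something like the following sketch. It assumes the model has already produced a score per item and that each item has a category; the scores, categories, and the 2x multiplier are invented for illustration.

```python
def apply_boosts(scores, categories, boosts):
    """Re-rank items after multiplying scores by per-category boost factors.

    scores: item -> model score
    categories: item -> category name
    boosts: category name -> multiplier (categories not listed keep 1.0)
    """
    boosted = {
        item: score * boosts.get(categories.get(item), 1.0)
        for item, score in scores.items()
    }
    # Highest boosted score first; ties broken alphabetically.
    return sorted(boosted, key=lambda item: (-boosted[item], item))


# Hypothetical model output for a customer viewing a power washer.
scores = {"hose": 0.9, "cleaner": 0.6, "nozzle": 0.5}
categories = {"hose": "hoses", "cleaner": "chemicals", "nozzle": "accessories"}

print(apply_boosts(scores, categories, boosts={}))
print(apply_boosts(scores, categories, boosts={"accessories": 2.0}))
```

With no boosts the ranking follows the data alone; with the accessory boost, the nozzle moves to the top even though its raw score was the lowest.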
Meghan Well, I have been playing with it to help me create content for my wireframes and user journeys, and it's been very impressive. It's really saved me a lot of time too, and I didn't see a lot of faults, because I was more in the general space when I was prompting. The content I was asking it to generate wasn't for such a specific thing.
Meghan But then I go back to thinking about how you said the product manager will actually take the time with the prompts. Well, then maybe they should also take the time with the categories of their inventory, and that's where the time should be spent. Maybe create a whole recommender system on top of that to combine products and help the customer be more engaged within that end-to-end experience, in that transaction.
Meghan So if they're looking for power washers, they can make sure that the buyer guides have words that map to the nozzles you may need, or a solution. So instead of a cleaning agent, it'll say solution in ChatGPT, but maybe on the other end it says cleaner. They may have to spend time, but they can increase their sales with this.
Will And yeah, I think what you're describing reminds me of the concept of a hybrid recommender system. We talked about content-based filtering, and then you've got item-based collaborative filtering. There's nothing stopping you from combining the results of both, right? And to your point, you can have a better signal because of that when the customer is on a particular page.
Will And the content of that page has already been precomputed into the recommender system. You've taken it out, you put it into Azure Data Lake, you bring it to Azure Databricks, and you train on both the transactions and the content on the page. The combination of the two gives a much better result, because you get the true patterns of what customers are buying, but also the patterns that you as the product manager or the content writer want to emphasize, right?
Will So if you're mentioning nozzles and hoses on that power washing page, maybe you don't sell a lot of those during the initial purchase of the power washer. As a result, that content pushes them up further, because you want to sell more of those, and the data and the content together produce that. So I think, yeah, that's a great point.
Will You can definitely make it a lot stronger with both rather than relying on one or the other. But also, just as an aside on what you were describing, that you can change the buyer guides: totally. We write content for humans, of course. But then slowly but surely, in the beginning of the search engine days with keyword optimization, we started writing content for search engines.
Will Now, are you writing content for ChatGPT, to optimize how it can be prompted? We're going down a very slippery slope at this point. I'm a little concerned about what we're going to be teaching journalism and communication majors in college. There's going to be an elective in prompting ChatGPT one day, maybe.
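The hybrid recommender idea Will describes earlier in this turn, combining transaction-based and content-based signals, can be sketched as a simple weighted blend. This is one of several ways to build a hybrid and is not necessarily what the hosts have in mind. It assumes both scorers already return values between 0 and 1; the items, scores, and weight are made up.

```python
def hybrid_scores(collab, content, content_weight=0.5):
    """Blend collaborative and content-based scores with a single weight.

    collab: item -> score learned from transactions (assumed in [0, 1])
    content: item -> score from page content similarity (assumed in [0, 1])
    Items missing from one source contribute 0.0 from that source.
    """
    items = set(collab) | set(content)
    return {
        item: (1 - content_weight) * collab.get(item, 0.0)
        + content_weight * content.get(item, 0.0)
        for item in items
    }


def rank(scores):
    """Items sorted by score, best first (ties broken by name)."""
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical scores for a customer on a power washer page.
collab = {"extension_cord": 0.8, "nozzle": 0.3, "hose": 0.2}  # what people buy
content = {"nozzle": 0.9, "hose": 0.7}  # what the buying guide talks about

print(rank(hybrid_scores(collab, content, content_weight=0.0)))
print(rank(hybrid_scores(collab, content, content_weight=0.5)))
```

Raising `content_weight` is the lever a content writer has: nozzles and hoses mentioned in the guide climb the list even if they rarely sell with the initial purchase.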
Meghan I'll tell you, that is so interesting, and that is always what I'm thinking about when I'm actually playing with ChatGPT or DALL-E 2. With DALL-E 2, you really have to be creative with the keywords and really understand what you want it to generate for you, or else you're going to get nothing. But yeah, how funny is that?
Will It's scary. I feel like the right word is scary.
Meghan I do think you're right. But yes, and so Will, I just have one question on this. We're talking about putting all the information together and setting it up in this hybrid approach. But where would a company actually go and set this up?
Will That's a good question. Yeah. So one of the struggles is that, number one, ChatGPT is brand new. So I still think there's room for improvement in being able to consume ChatGPT. By the time you're listening to this, maybe that's different, but at the current moment it's kind of like a research toy rather than a fully fledged production algorithm.
Will So that remains to be seen. But let's imagine in the future we have this as a service that you can just call, and there are good SLAs around it. Assuming that's the case, it all kind of comes back to this: let's take our content from our asset management database, our digital library of all product descriptions or item master data.
Will Then take all of our transactions. Let's bring that into Azure, and we land that data in Azure Data Lake Storage Gen2, because that's our scalable storage engine. You might get that data in there through something like Azure Data Factory or other tools. Depending on how large the data is, you could just upload it directly into storage. So the data is there now in the data lake, and now we actually have to train the machine learning algorithm, the recommender system itself, because ChatGPT can be used to generate content or perhaps be prompted to discuss these sorts of things.
Will But ultimately we still need to train that model. So we might use a tool like Azure Databricks, because it has a couple of algorithms baked into it directly. One is FP-Growth, which is kind of an old-school thing and one of the early approaches to recommendations. And then there's the alternating least squares recommendation algorithm, and that's the cool one: it uses matrix factorization to extract latent, hidden features from your data. That's usually the one most organizations will use, because it is more robust and more scalable.
Will So we've got our transactions and our content, we're using Databricks, we're probably using the alternating least squares algorithm, and then we train a model. That model can then be deployed as a real-time service so we can generate recommendations on the fly. We would use Azure Machine Learning and its managed online endpoints to have an API, something we can call from the browser or the back-end services to generate recommendations.
Will So great, we can do that. We could also generate recommendations in advance. You talked a little bit about how a product manager might want to hard-code things in there, so we could generate recommendations in advance, and the product managers might have business rules to tweak those recommendations, and that might be baked into our process. We can land those results in some sort of scalable storage like Azure Cosmos DB.
Will So we've got perhaps precomputed recommendations in Azure Cosmos DB, and then we might have on-the-fly recommendations from Azure Machine Learning's managed online endpoints. Now it's just a matter of hooking up the website or your back-end systems to be able to call those recommendations. That's just another API call, and that's great.
Will And now you're cruising, right, because you can start calling that API or just bring down the results from Cosmos DB and display the recommendations. We actually have a bunch of solution accelerators that help you get started, and reference architectures for you to develop your own solution on top of what we've already designed.
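The serving pattern Will outlines, precomputed recommendations in a store plus an on-the-fly model behind an API, can be sketched without any cloud services. In the sketch below a plain dictionary stands in for the precomputed store (Azure Cosmos DB in Will's description) and a plain function stands in for the real-time endpoint. Every name here is hypothetical and no Azure SDK is used.

```python
class RecommendationService:
    """Serve precomputed recommendations, falling back to a live model."""

    def __init__(self, precomputed, score_online, pinned=None):
        self.precomputed = precomputed    # stand-in for a document store
        self.score_online = score_online  # stand-in for a real-time endpoint
        self.pinned = pinned or {}        # product manager overrides

    def recommend(self, item_id, top_n=3):
        recs = self.precomputed.get(item_id)
        source = "precomputed"
        if recs is None:
            # Nothing stored for this item, so ask the live model.
            recs = self.score_online(item_id)
            source = "online"
        # Business rule: pinned items always come first, without duplicates.
        pinned = self.pinned.get(item_id, [])
        merged = pinned + [rec for rec in recs if rec not in pinned]
        return merged[:top_n], source


def fake_online_model(item_id):
    """Hypothetical stand-in for a call to a deployed model."""
    return [f"{item_id}_accessory", f"{item_id}_warranty"]


service = RecommendationService(
    precomputed={"power_washer": ["nozzle", "hose", "deck_cleaner"]},
    score_online=fake_online_model,
    pinned={"power_washer": ["deck_cleaner"]},
)

print(service.recommend("power_washer"))
print(service.recommend("ladder"))
```

The website only ever calls `recommend`; whether the answer came from the batch job or the live model is an implementation detail behind that one call.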
Meghan The solution that you've just explained to me is awesome, and there are a lot of elements to it. It's almost overwhelming. How would a company go about researching whether a solution like this would be the right one for them? Or if they wanted to prepare for something like this in the future, could they reach out to someone like you?
Will Yeah. So feel free to reach out to us. If anyone's on their journey and wants to just kind of chat about it, we'd be happy to help. But also, at every major organization you're going to have a Microsoft account person, and you've got lots of people who are ready and willing to jump in and help you start on your journey with recommender systems.
Will Like I mentioned, there's also a bunch of solution accelerators that you can just kind of kick the tires with, right? Stuff that you can just deploy and try out. And then of course it's out on GitHub, so you can ask questions, raise issues, and things like that. But ultimately, yeah, you've got a huge support system within Microsoft that can help you get started quickly.
Will If not, you can go through the open source community as well.
Meghan That's awesome. Thanks, Will. So where could someone go and play with this? You said that they can go and try it out. Is there a site that they can go to?
Will I think there are a couple of things. If you want to just kind of kick the tires, I would first start with a GitHub repo hosted by Microsoft at github.com/microsoft/recommenders. This is a large collection of different recommender system algorithms, with Jupyter notebooks that you can just start playing with. For the more technical folks, this is going to be a great playground to start working through, seeing different algorithms and trying them out. Load up a notebook, change the input data to your data, and great.
Will You can try a boatload of different algorithms very, very quickly. So that's github.com/microsoft/recommenders. We'll put this in the description as well. That's number one, the playground. Another one, perhaps even simpler, is the Databricks solution accelerators. If you go to your favorite search engine and plug in Databricks solution accelerator recommender systems, you'll come to a page that has a huge number of different solution accelerators, but then you can filter down to recommender systems. They've got a handful of really interesting ones that take, again, some kind of toy data and go through a single notebook.
Will It explains in good detail what's happening in each step. So you can start getting an intuition for what's going on with this particular algorithm and what you can do with it. Even better, again, you can take that same notebook, plug in your data instead, and just start running through it.
Will So those are my two main recommendations for you to get started.
Meghan Is this free for everyone?
Will So the content is free? Yes, absolutely. Of course, there's going to be some Azure cost alongside it as well, so just bear that in mind. You can of course get free Azure credits for trials for your personal usage. There may be some costs when you spin up some compute to start running the recommender systems, and a little bit for storing your data, but it's nominal, right?
Will Unless you're working on huge amounts of data, you should have no problems quickly testing this out.
Meghan And the solution will pay for itself in the end if it works.
Will I like that thought process right there. That's great. Yes, exactly. Exactly.