For content editors. My name is Vincenzo Gambino. I'm a Drupal developer from London, UK, and I've been working with Drupal for 14 years now. I've spoken at several Drupal camps around Europe and the US, and I'm the co-author of the book Jumpstart Jamstack Development, which explains how to build applications using Gatsby and Netlify; of course, it also has a chapter on Gatsby and Drupal. So, it's been almost a year and a half since we met OpenAI and ChatGPT. In November 2022 it became available to everyone. Everyone started testing it, playing with it, seeing what it could do, and companies did the same. Companies saw potential in this tool and started investing in it, creating tools, products, and many other things: tools for generating reports from prompts based on company information, based on your information. And that raised questions about data usage and privacy, which ended up in the EU AI Act.
So basically, regulators want to know what companies are doing with the data you input: how they use it, whether they use it to retrain their LLM, or whether they use it for something else. We all saw this potential, and we used it. Among developers and people in many other roles, we saw first fear, because we thought it was going to take our jobs, and then reassurance, because the AI is not yet capable of taking our jobs; it still needs human supervision. So all these applications the companies were building, they were just, like I said, building applications and throwing them at the market. There is a theory for that, the spaghetti-at-the-wall theory, which comes from a cooking technique that, as an Italian, I didn't want to investigate further. Basically, when you cook spaghetti, to see if it's ready or not, you take one strand and throw it at the wall. If it sticks, it's ready; if it doesn't stick, it's not ready. And that was the technique companies were using: just build applications and see if they stick. In the end, we now use it in our daily job to help us speed up what we are doing, and this is what we are going to see now. So today we're going to see what OpenAI is, just to go through the terminology. We're going to look at the Drupal modules that are available at the moment. We'll see how we can use them for our own content. Then at the end we'll have a demo, and then the Q&A. So, we have all these models we can use from OpenAI. We have the DALL·E model, which can generate and edit images. We have Whisper and TTS, which are models that convert audio into text and vice versa.
Also, we have the embeddings, a set of models that can convert text into numerical form; this is something we're going to see later. Then we have moderation, a model that can detect whether a text may be sensitive or unsafe, for example. And then we have the GPT model, which can understand and generate natural language; this is the one we usually use through ChatGPT, for example. GPT is a language model developed by OpenAI. It stands for Generative Pre-trained Transformer, and it is designed for generating human-like text in a conversational setting. We use it for chatbots or virtual assistants, for example. We use it for generating articles, blog posts, or other writing. We use it for translation, for interactive gaming experiences and dynamic storytelling, and for personalized recommendations based on user preferences. How can we interact with GPT? Through the ChatGPT interface or through the API. Before we dive into the API,
I want to talk about some of the key concepts. We have the prompt, which is essentially how you program the model, how you send it instructions. Then you have tokens, which are the units of text that the language model reads and processes. And then we have the models we saw before. The API prerequisites are an OpenAI account, an OpenAI API key, and a payment method. These are some example API endpoints for using the OpenAI models: completions, embeddings, image generation, and text to speech. We make a POST request to one of those endpoints, and we get the information back. This is an example call. We have the header with the authorization, carrying the API key; then the model; the prompt, which is here; then the max tokens; and then the temperature. The temperature sets how predictable the response will be.
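That example call can be sketched as a request payload plus a response parser. This is a minimal sketch in Python rather than the HTTP call itself; the field names (`model`, `prompt`, `max_tokens`, `temperature`, `choices`, `usage`) follow the OpenAI completions API, but the helper names and the sample values are my own:

```python
import json

API_URL = "https://api.openai.com/v1/completions"  # completions endpoint

def build_completion_request(api_key, model, prompt,
                             max_tokens=256, temperature=0.4):
    """Assemble the headers and JSON body for a completion call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        # 0.0 = very predictable, 2.0 = very unpredictable
        "temperature": temperature,
    }
    return headers, json.dumps(body)

def extract_answer(response):
    """Pull the generated text and total token usage out of a response."""
    text = response["choices"][0]["text"]
    used = response["usage"]["total_tokens"]
    return text, used

# A response shaped like the one on the slide (values invented):
sample = {
    "model": "gpt-3.5-turbo-instruct",
    "created": 1700000000,
    "choices": [{"text": " blue."}],
    "usage": {"prompt_tokens": 4, "completion_tokens": 2, "total_tokens": 6},
}
print(extract_answer(sample))  # (' blue.', 6)
```

In a real module you would send that body to the endpoint with your HTTP client of choice and read the same fields off the decoded JSON.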
So if you set, for example, a temperature of zero and you send "the sky is" to OpenAI, it will return "the sky is blue." The higher you set the temperature, the more unpredictable the answer will be: it can come back saying that the sky is this remote controller, for example, with a high temperature such as two. This is the response. We have the model, the date, and the choices, where we have the answer, the text; and then we have the usage: the tokens we sent in the prompt, the tokens for the completion (the answer), and the total tokens used. Now let's see what modules are available for Drupal at the moment. We have the OpenAI / ChatGPT integration module, which is a suite of submodules that connect to OpenAI and use some of its models. What we're going to see today are the OpenAI CKEditor integration, the content editing tools, the ChatGPT explorer, and the OpenAI DALL·E module as well.
How do we install it? As usual, using Composer, and then we run Drush to install it. We then go to OpenAI to create a new secret key. In this case it's already created, but if you click on "Create new secret key" you will see this window here: you give it a name and then click "Create secret key." You copy the key. You also need the organization ID, which also comes from your OpenAI account. Then, in Drupal, under the OpenAI settings, you add the API key and the organization ID, and now you're ready to go. You can start installing all the other submodules, which will then connect to all the models in OpenAI. We start with the editor submodule. We run Drush to install it, and this will add a button available in the editor. We go into the Basic HTML text format settings, for example, and add this button under the available buttons. Then, on the same page, we configure the module: the temperature and the max tokens to be shared between the prompt and the answer.
And this is what it's going to look like. The functionality gives us the option to do text completion, to adjust the tone of voice, to summarize the text we wrote, to translate it, and also to reformat and correct the HTML if we have broken something in the body. We click on text completion and it opens the window with the prompt. We add, for example, "tell me the history of Drupal," and the system starts writing directly into the body of the node. Then we can change the tone; for example, I can ask it to explain the text to me like I'm five years old. Then, of course, you can summarize. This one will not show any prompt; it will just summarize the text you have selected in the body. You can also translate it, and this translation is not tied to any language you have installed in your Drupal translation settings; it can be any language you want.
Then we can test the OpenAI content tools submodule. We install it, and this provides, on the right-hand side of the node edit form and the add node form, the change tone functionality, summarize, and also suggest taxonomy, which we'll see later in the demo. These are the options it gives us: text tools, change tone, summarize, suggest taxonomy, suggest content title. This is how suggest taxonomy works: we choose the body field we want and click on suggest taxonomy. At the moment this is basic functionality: it just reads the body of the node and returns some keywords. If we want to receive taxonomy based on our own vocabularies, for example, we can also send the array with all our terms in the prompt, and we can ask it to find the most appropriate keywords matching our terms. For the moment, though, it's just the basic functionality. This is how we can extend the OpenAI module.
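That idea of constraining suggestions to your own vocabulary can be sketched as a prompt builder. This is a sketch only; the helper name and the prompt wording are my own, not part of the module:

```python
def build_taxonomy_prompt(body_text, vocabulary_terms):
    """Ask the model to pick matching terms from a fixed vocabulary
    instead of inventing free-form keywords."""
    term_list = ", ".join(vocabulary_terms)
    return (
        "Classify the following text. Choose only from these taxonomy "
        f"terms: {term_list}.\n"
        "Return the matching terms as a comma-separated list.\n\n"
        f"Text:\n{body_text}"
    )

prompt = build_taxonomy_prompt(
    "Drupal 10 ships with CKEditor 5 and a new Claro admin theme.",
    ["CMS", "JavaScript", "Cooking", "Accessibility"],
)
```

The resulting prompt would then go out through the same completion call the module already makes, and you would match the returned terms back to your vocabulary before saving them.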
It's using the openai-php/client package. Basically, we use the OpenAI class, create the client, and then use the completions method to do, for example, what we've seen before: we pass the model and the prompt. That is the basic usage of the OpenAI PHP client. Then, in our own Drupal module, we do the same thing, but in this case we pass the model, the prompt, the temperature, and the max tokens that we set in our settings on the Drupal side. The response will then be under choices, the first item, then text. Another module I want to talk about is the ChatGPT Content Assistant module. This module is a lightweight version of the other one: you can create content, translate content, and also create images. You install it as usual: you require it with Composer and install it with Drush. Then, in the settings, we need to set the model, the endpoint, the endpoint for creating images, the OpenAI access token (the API key), the max tokens, the temperature, and the content types where we want to enable this functionality.
So in this case, instead of having it in the WYSIWYG editor, we have it as a link here. Once you click the button, you'll see a pop-up with the ChatGPT content generator. You can ask it any question, and it will start generating the content, which stays in the pop-up. Then, if you're happy with the content, you can click "Use this content" and it will be pasted directly into the body of your node. As for translation, this one is tied to the languages that have been installed in Drupal. So if you have installed Dutch, for example, you can go to the translation tab, click "Translate using ChatGPT," and it will translate all the text fields you have in that node. How can you extend the ChatGPT Content Assistant? It uses its own class and makes its own calls; it's not using any package. You need to use its GPT API service and the method that gets the GPT response text. This does the same thing we were doing in the other module with the OpenAI PHP client.
So now that we've seen we can integrate OpenAI into our system, we can create many other types of functionality for the content editor. They can create content, map taxonomy onto the content they wrote, choose keywords, generate titles; they can do many things. They can also create images, as we're going to see in the demo. So, now that we have created content, what can we do? We can index that content, we can improve the search experience, and we can train GPT with our own data. For example, if we have an FAQ, we can teach GPT with our FAQ so it knows our services. For that we will use embeddings and RAG, which we'll go through now. RAG is retrieval-augmented generation. It combines the strengths of pre-trained language models with a retrieval system, and it works in two parts: the retriever, which queries the vector database, and the generator, which produces a more informed and contextually relevant response, basically in human language. In this case the generator will be GPT. How do we do the embeddings? An embedding is a vector of floating-point numbers: basically a coordinate for the semantic meaning of a word or a sentence. We can use it for search, for example; we can use it for recommendations and for classification. Embeddings can be represented this way. Here's an example: we have the word "cat," which gets the coordinates (2, 3). Then we have another word, "dog," which gets the coordinates (2, 4). And then we have "car," which has the coordinates (7, 1). When OpenAI is doing a semantic search, it checks the distance between the coordinates of the words. In this case "cat" and "dog" are similar, because they are both animals, while "car" is more distant from the two because it's not an animal, for example. So where can we store those coordinates?
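That distance comparison can be sketched in a few lines. The 2-D coordinates below are the toy values from the slide; real embedding models return vectors with hundreds or thousands of dimensions, and cosine similarity is one common way to measure how close two meanings are:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors:
    near 1.0 = similar meaning, lower = less related."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 2-D "embeddings" from the slide.
cat, dog, car = (2.0, 3.0), (2.0, 4.0), (7.0, 1.0)

print(cosine_similarity(cat, dog))  # high: both animals
print(cosine_similarity(cat, car))  # lower: unrelated meanings
```

With real embeddings the comparison works the same way; only the number of dimensions changes.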
In this case, my example will use Pinecone, which is a managed vector database, and you can use its API to run queries against Pinecone. So how do we integrate it? We need two modules, actually three. The first one is Search API, of course. Then we have the OpenAI embeddings submodule and Search API Attachments. The embeddings submodule adds the functionality for searching within your content through the vector database, and Search API Attachments lets you add file attachments to the Search API index. First we set up the Search API Attachments settings: we select, in this case, the PDF-to-text extractor, because we're going to upload some PDFs. The system will read each PDF and send it to the vector database for storage. After that, in the embedding settings of the OpenAI module, we set the model to use and which vector plugin we're going to use. And then, in the Search API index, we add the field "attachment," which is the first one here under General.
So this adds the search attachment field, and we set its type to embeddings. I also added the attachment here, and the body here; the body has the embeddings type as well. Then we add the Search API block, so we get a pop-up where we can chat with OpenAI and ask questions about our documents. And this is an example: I've uploaded a document about AI in healthcare, and I'm asking GPT to tell me about AI and healthcare. It returns the content: the node that talks about AI and healthcare, and also the white paper that has been attached to it here. So, ready for the demo? Not yet. I want to talk about tokens. Tokens, as I said, are pieces of text that have been parsed out for further processing. The machine doesn't understand human language; it understands numbers. So everything we pass to the LLM is going to be translated into numbers. First we take the words and translate them into numbers, and then, for the semantic meaning of the word or the sentence, we translate again into numbers, in this case coordinates.
Here's an example. A token is usually about four characters, and we can see it here: we have this sentence with 292 characters, and it comes out to 76 tokens. Those are then translated into token IDs, here. This is what is actually sent to OpenAI, to the LLM, to make it understand what we are saying. Now, there is a restriction on tokens: you can send up to about 4,000 tokens. So when you have a document that is, for example, 400 pages long, you can use a technique where you chunk the content, send each chunk to the embeddings model, and then send the result to the vector database. So you have the original text; then the text splitter, which cuts it into chunks; and then each chunk goes to the embeddings model, which returns coordinates for the meaning of that specific sentence or word.
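The splitter step can be sketched with the rough rule of thumb from the slide, about four characters per token. This is a simplified sketch; real pipelines use an actual tokenizer rather than character counts, and the function names here are my own:

```python
def split_into_chunks(text, max_tokens=4000, chars_per_token=4):
    """Split a long text into pieces that each fit under the model's
    token limit, using the rough ~4 characters-per-token estimate."""
    max_chars = max_tokens * chars_per_token
    chunks = []
    while text:
        chunk = text[:max_chars]
        # Prefer to break on whitespace so we don't split a word.
        if len(text) > max_chars:
            cut = chunk.rfind(" ")
            if cut > 0:
                chunk = chunk[:cut]
        chunks.append(chunk)
        text = text[len(chunk):].lstrip()
    return chunks

def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate; 292 characters comes out around 73."""
    return max(1, len(text) // chars_per_token)
```

Each chunk would then be embedded separately and stored in the vector database with a reference back to its source document.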
And then the bigger picture. You have, for example, a PDF or a piece of content. It is separated into chunks; each chunk is sent to OpenAI for embedding; it returns the coordinates; the coordinates are then stored in a vector database, ready to be used. Then, from a Drupal application, someone types a prompt. The question is embedded too, to get its semantic meaning; then we query the vector database; the vector database returns the results; and then the LLM generates a more human response and sends it back to Drupal. Okay, now it's demo time. We have my application, so let's go back here. For example, here we can ask again, and now it's performing the search in the vector database. It's taking a second. Okay, now it has returned the response, mainly based on the PDF I uploaded, because my content is just this part here, nothing more. Most of the content is in the PDF, so it's returning information based on the PDF.
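The retrieval half of that pipeline can be sketched with an in-memory stand-in for the vector store. In the real setup, the query embedding comes from the OpenAI embeddings API and the nearest-neighbour search happens inside Pinecone; the function names and sample data here are illustrative only:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def retrieve(query_vector, store, top_k=2):
    """Return the top_k stored chunks closest to the query embedding."""
    ranked = sorted(store,
                    key=lambda item: cosine(query_vector, item["vector"]),
                    reverse=True)
    return [item["text"] for item in ranked[:top_k]]

def build_rag_prompt(question, context_chunks):
    """Prepend the retrieved chunks so the LLM answers from our data."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (f"Answer using only this context:\n{context}\n\n"
            f"Question: {question}")

store = [
    {"text": "AI helps radiologists detect tumors.", "vector": (2.0, 3.0)},
    {"text": "Drupal 10 release notes.", "vector": (7.0, 1.0)},
]
hits = retrieve((2.0, 4.0), store, top_k=1)
prompt = build_rag_prompt("How is AI used in healthcare?", hits)
```

The generated prompt is then what goes to GPT, which is why the answer comes back grounded in the uploaded documents rather than in the model's general knowledge.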
So that means we can upload many white papers, many PDFs, let them be indexed, and then the end user can go into our application and start questioning those PDFs before even reading them. That's really useful when you're doing research and you want to know whether a 200-page PDF is relevant to you: you can chunk it, store it in a vector database, and then it's ready for the end user to question for information. There are good use cases in the legal field, and in healthcare as well. And if you're not using PDFs, just the FAQ is good for customer service, where you put all your contact information and how to solve problems into the vector database, and then the user can access it through the AI. So let's see how we can configure it. We go to Configuration, and we have the OpenAI module here. Here we have the settings and the available submodules we can use. Then we have here the Search API Attachments settings I showed before.
So here I've selected the PDF-to-text extractor, and here is the path to the extractor binary on my server. Then, in the embedding settings, still from the OpenAI module, I have selected the content to index, the model to use, and which vector plugin I'm going to use. You can choose between Milvus and Pinecone; I chose Pinecone for this demo. And then here you add your Pinecone information, like the API key. Once that is set up, you can create your search server; it will be with Pinecone, and the backend is automatically taken from the embeddings. Then, in my search index, I have added content and files, and set the server backend. In my fields, I've selected the Search API attachments field; those fields appear automatically for each file field you have in your application. Then, as a processor, I have the file attachments processor, excluding certain file types. That's it for search. Now for the editor.
Let me go to the text formats: Basic HTML here, and its configuration. I have the OpenAI button here in the active toolbar. And here, in the OpenAI tools, it's enabled; I choose the default model, the temperature, and the max tokens that can be used. Then I'm ready to use it in content, so I can start creating some content. Here I can choose text completion. For example, I write a prompt and press enter, and it starts typing for me; here I can see it receiving the response. And it's done now. For example, I can select the text and translate it. Now it has been translated into Italian automatically, the same text that was there. Then, from the GPT content tools we have here on the edit form, let's put it back in English: translate it back. Okay. Now we can use one of these. For example, we can analyze the text to see if there is any policy violation. We don't need to select anything, because we choose the body field. We can analyze it, and this text does not violate any content policy.
We can, for example, adjust the tone, and this does not get appended directly into the body; it shows the result here. So let's try an example; there it is. We can also summarize it, and then we can copy and paste the result into our summary. We can suggest a content title, for example; it suggested one about Chicago in this case. And we can also suggest taxonomy: history, Chicago. Then the other module I showed is this one, so we can click here: generate using ChatGPT. Again, it will not write directly into the body here, but into the pop-up. This takes a bit because it's not writing directly here. Okay, there's a timeout error; see, this is the risk of the live demo. Let's try again. There's some issue on their side, I guess. But anyway, let's do something more fun. Let me see a second. Let me go to Configuration, and we can do the DALL·E image generation. We use DALL·E 3; we can set the size, and here we get the URL response. And we can do something.
Let's see. We're in Chicago, so we can do "Terminator playing for the Chicago Bulls," for example. Let's see what happens now. Submit. Okay, let's see. Are you ready? Okay, so this is the image that has been generated now with the prompt we set. Prompt-wise, we can of course test whatever we want in our application. From here, for example, we have the ChatGPT explorer. In this way you can test ChatGPT: you can ask any question and get the response. The response is based on this prompt here; this is the profile setting, the first piece of information we're going to give it. Here we're saying that we are a Drupal website, so everything we ask will come back related to Drupal. So, for example, if I ask how to build a controller and submit, I'll get the response based on that: to build a controller in Drupal you typically create a module, and so on. So all the information here is for Drupal.
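That profile text behaves like the system message in OpenAI's chat API: it steers every answer before the user asks anything. A sketch of how such a payload is shaped (the roles follow the chat completions format; the model name and wording are illustrative):

```python
def build_chat_payload(profile, question, model="gpt-4"):
    """The profile goes in as a system message; the user's
    question follows it, so every answer is steered by the profile."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": profile},
            {"role": "user", "content": question},
        ],
    }

drupal_payload = build_chat_payload(
    "You are a helpful assistant for a Drupal website.",
    "How do I build a controller?",
)
laravel_payload = build_chat_payload(
    "You are a helpful assistant for Laravel developers.",
    "How do I build a controller?",
)
```

The same user question with a different system message yields a Drupal answer in one case and a Laravel answer in the other, which is exactly what the explorer demo shows.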
But, for example, we refresh to clear the cache. So now, instead of saying "I'm a Drupal website," I say "I am a Laravel developer," for example, and I ask how to build a controller. We submit, and now it gives the response based on this profile, the Laravel developer profile. So how can we use this? Again, for customer service, for example, where we can provide a lot of information in the prompt before the user asks any question. This can be set as a profile for anything you're building. Okay, now back to the slides. In this session we saw the status of the OpenAI modules in Drupal. We saw how we can connect to the API and extend the current functionality. And we also saw how we can train the LLM with our own data using RAG. So now that we know how to do both things, we can just start developing: we can talk to our editors and see what they need.
We can build custom applications for our clients, we can extend our current websites, we can do many of the things we want. But what I'm telling you is that you can also just throw your spaghetti at the wall. And that's it. So, thank you.
>>:
Thanks. And if anyone has any questions, you can come up to the laptop up here and ask.
>>:
Have you had any problems with rate limits, with organizations? Pardon? Sorry. Any problems with rate limits, like organizations hitting rate limits on search, things like that? Sorry, can you hear me? Yes. Okay. Have you had any problems with organisations hitting GPT rate limits? Rate limits in which sense: in terms of expenses, or the token limit? I'm not sure if tokens or API calls are different, but have you reached the top of your plan, as far as either calls or tokens? So, basically, for that, just give me a second; I think I have it in here. This is the usage, right? One of the limits you can set is this one, so when you reach the limit, it can send you a notification. That's in terms of API calls. Tokens are just, like, the number of words you can send in one single request.
So when you do an API request, you have a limit on the number of words, in this case tokens, that you can send. But to avoid that, you can chunk the text across more requests. As far as I know, there is no limit apart from the one you set yourself; there's no cap on the number of calls you can make. But there is a per-call cost, right? Yes, yes. For example, today, during this demo, I used the image model, the embeddings model, one call with GPT-3.5 Turbo, and a couple of calls with GPT-4. Everything done on the node side was GPT-4, and that was $0.15; all the others were less than $0.01 each. And the image model, to generate this image, was $0.08. Okay, so it's a lot more expensive to generate images than to make text calls? Yes; the more you generate, the more expensive it becomes.
Yes. Okay, thank you. You're welcome. Okay, if there are no other questions, thank you very much, Vincenzo. That was great. Yeah, thanks. Thank you very much for coming out virtually. So here are the resources, and then there's contribution day on Friday.
>>:
Yeah.