Building a chatbot for a conference with GPT Engineer in 2 hours

13.12.2023

So we decided to participate in a local AI conference and needed something exciting and interactive to draw people to our booth. After some thought, we opted to create a chatbot. This wasn’t just any chatbot; it was designed to play a game with users by showing them an AI-generated image and challenging them to craft a prompt that would recreate this image as accurately as possible. Sounds fun, right?

The conference was just two days away, and with everyone else occupied, I volunteered to develop the bot. I must confess, I had never created a Telegram bot before. However, with the assistance of mighty AI, I believed anything was possible. So, I embarked on this endeavor!

For those unfamiliar, GPT Engineer is a tool akin to Auto-GPT. It’s capable of autonomously generating code and entire applications based solely on a description. That’s the theory, at least. In practice, it’s a bit more complicated. Yes, it can generate code, but running it successfully is another story. It provides a good starting point, but you’ll likely need to refine and adjust it to achieve your desired outcome.

To utilize GPT Engineer, you need to download a repository and set it up by following the instructions in the provided README file. It’s essentially a collection of Python scripts, so the setup process is relatively straightforward. Once ready, you should describe what you want to build in a ‘main_prompt’ file, which you’ll have to create. Then, run the tool and wait for it to generate the code.

When it comes to the description, precision is key. You need to be as detailed as possible about the desired outcome and the technologies you wish to use. Asking the right questions is crucial here. Also, using the latest version of the OpenAI API is advisable, as the results vary significantly depending on the version. Here is my glorious prompt:

Overview
This document outlines the design for a game where players guess the textual prompts used to generate images by DALLE-3. The game will be implemented using the OpenAI API, Python programming language, and Telegram API. It is a guessing game where players are shown an image generated by DALLE-3 and must guess the prompt used to create it.

Game Flow
Start: The player starts the game through a Telegram bot.
Image Presentation: A DALLE-3 generated image is presented to the player.
Guessing Phase: The player has 3 attempts to guess the prompt.
Submission: After 3 attempts, the player submits their final guess.
Data Storage: The game stores the player’s guess and other relevant data.

Technical Components

OpenAI API

Purpose: To generate images based on textual prompts.
Integration: Python scripts will interact with the OpenAI API to request image generation.

Python Backend

Functionality:
Communicate with the OpenAI API.
Handle Telegram bot interactions.
Manage game logic and state.
Store and retrieve data.

Libraries:
python-telegram-bot: For Telegram bot interactions.
requests: For making API calls to OpenAI.
os, json: For file handling and configuration management.

Telegram API
Purpose: To provide a platform for users to play the game.
Integration:
The game will be accessible through a Telegram bot.
The bot will handle user inputs and display images and messages.

Game Logic
Starting Image:
Read an image name from the config file, load it from the root folder.
User Interaction:
Present the image to the user via Telegram bot. 
Receive guesses from the user.
After each guess - use OpenAI API to generate an image from that prompt and 
show this image to the user.
Allow up to 3 attempts per image.
Data Handling:
After 3 attempts or final submission, save the user’s guess.
Store the image, prompt, user’s guess, Telegram username, and account info.

Data Storage
Location: user_data folder in the bot’s root directory.
Structure:
Subfolders named after the user’s Telegram username.
Folder contains info.txt file with all user data and an image generated by the user.

Configuration File
File Name: bot_config.json
Contents:
API keys.
Game settings (e.g., time limits, number of attempts).
Other configurable parameters.

Internationalization:
gettext should be used for all text messages

Pretty detailed, as you can see. From that, the tool generated a project that I tried to run, without success. It turned out that the data the OpenAI model was trained on was outdated: the Telegram API had changed significantly several months earlier, and the model didn’t know that. So after some digging in the Telegram documentation and cursing Pavel Durov for changing the API so often, I was able to run the code.

But before it can be run, we need to register the bot with Telegram. This is done by talking to a special Telegram bot called BotFather.

BotFather is the official bot by Telegram that allows you to create and manage bots. To start, search for ‘@BotFather’ in the Telegram search bar and start a chat with it. Then:

  1. Create a New Bot: Send the /newbot command to BotFather. BotFather will then ask you to choose a name and username for your bot. The name is what users will see in conversations and the username is how your bot is found on Telegram. The username must end in ‘bot’, like ‘examplebot’.
  2. Receive Your Token: After you’ve set the name and username, BotFather will provide you with a token. This is a unique identifier for your bot and is used to authenticate your requests to the Telegram Bot API. Keep this token secure and don’t share it with others.
  3. Configure Your Bot (Optional): You can use BotFather to set up your bot’s profile picture, description, about info, and more. These commands are optional but can help make your bot look more professional and approachable.
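
Before wiring the token into the bot, it can be worth checking that it actually works. Here is a minimal sketch using only the standard library: Bot API methods live at https://api.telegram.org/bot<token>/<method>, and getMe returns basic information about the bot. The helper names build_api_url and check_token are hypothetical, not part of any library.

```python
import json
import urllib.request

API_ROOT = "https://api.telegram.org"


def build_api_url(token: str, method: str) -> str:
    """Return the Bot API URL for a method such as getMe."""
    return f"{API_ROOT}/bot{token}/{method}"


def check_token(token: str, timeout: float = 10.0) -> dict:
    """Call getMe and return the decoded JSON reply.

    This performs a real network request. If the server answers with an
    HTTP error status, urlopen raises urllib.error.HTTPError.
    """
    url = build_api_url(token, "getMe")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling check_token with the token from BotFather should return a JSON object whose "result" field describes the bot. Keep the token out of source control, whichever way you load it.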

Now that you have your bot registered, let’s look at the code:

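Here is main.py, the game loader we will start the bot with:

from telegram.ext import Application

from bot_config import BotConfig
from game import Game  # assumes the Game class lives in game.py


def main():
    # Load our bot configuration
    config = BotConfig.load_config()

    # Build the python-telegram-bot application from the bot token
    application = Application.builder().token(config.telegram_token).build()

    # Instantiate the game; start() blocks until the bot is stopped,
    # so the status message is printed first
    print("Bot is starting")
    game = Game(config)
    game.start(application)


if __name__ == "__main__":
    main()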

Application is the python-telegram-bot class that ties the bot together: handlers are registered on it, and it runs the polling loop.
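
The BotConfig class itself is not shown in this post. Here is a minimal sketch of what such a loader could look like, assuming bot_config.json holds flat keys named after the attributes used in the code (telegram_token, openai_api_key, max_attempts). The JSON layout and the default of 3 attempts are assumptions:

```python
import json
from dataclasses import dataclass


@dataclass
class BotConfig:
    telegram_token: str
    openai_api_key: str
    max_attempts: int = 3  # assumed default; the prompt asks for 3 attempts

    @classmethod
    def load_config(cls, path: str = "bot_config.json") -> "BotConfig":
        """Read the JSON config file and fail early if a required key is missing."""
        with open(path, "r", encoding="utf-8") as config_file:
            raw = json.load(config_file)

        required = ("telegram_token", "openai_api_key")
        missing = [key for key in required if key not in raw]
        if missing:
            raise KeyError(f"Missing config keys: {', '.join(missing)}")

        return cls(
            telegram_token=raw["telegram_token"],
            openai_api_key=raw["openai_api_key"],
            max_attempts=int(raw.get("max_attempts", 3)),
        )
```

Failing at startup on a missing key is usually easier to debug than a bot that starts and then breaks on the first image request.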

The main logic of the game is in the Game class:

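# Imports at the top of the module
from telegram import Update
from telegram.ext import Application, CommandHandler, MessageHandler, filters

def start(self, application: Application):
    self.app = application

    # Add command handlers
    application.add_handler(CommandHandler("start", self.handle_start))
    application.add_handler(CommandHandler("rules", self.handle_print_rules))
    # Any plain text that is not a command goes to handle_message
    # (python-telegram-bot v20+ exposes filters as a lowercase module)
    application.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, self.handle_message))

    # Start the bot
    application.run_polling(allowed_updates=Update.ALL_TYPES)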

Here we define the types of commands/messages our bot is going to react to. We are going to have 3 types: a start command that starts the bot, a rules command that displays game rules to the user, and any random message that is typically going to be an image description that the user created.

Because there could be many users playing simultaneously, we need to keep track of each session. For that, we will create a dictionary called sessions in the game constructor.

class Game:
    def __init__(self, config):
        self.sessions = {}
        self.config = config
        self.app = None

When we receive a message from a user we decide if this is a returning user — in which case we load their session from the user_data folder (from info.txt file), or a new user, in which case we create a new session:

def get_user_session(self, update: Update):
    user_id = update.message.from_user.id
    if user_id not in self.sessions:
        user = User(user_id, update.message.from_user.full_name, update.message.from_user.username)
        session = self.load_user_session(user)
        self.sessions[user_id] = session

    return self.sessions[user_id]
 
def load_user_session(self, user):
    user_folder = f'user_data/{user.telegram_name}'
    if not os.path.exists(user_folder):
        os.makedirs(user_folder)

    info_path = os.path.join(user_folder, 'info.txt')

    if not os.path.exists(info_path):
        user.image_submitted = ''
        user.attempts_left = self.config.max_attempts
        return GameSession(user, self.config)
 
    with open(info_path, 'r') as info_file:
        # Skip the Name, Id and Telegram account lines
        info_file.readline()
        info_file.readline()
        info_file.readline()

        # Split on the first colon only, so values that contain ':' stay intact
        user.image_submitted = info_file.readline().split(':', 1)[1].strip()
        user.attempts_left = int(info_file.readline().split(':', 1)[1].strip())
        user.last_prompt = info_file.readline().split(':', 1)[1].strip()
        user.position = info_file.readline().split(':', 1)[1].strip()
        user.company = info_file.readline().split(':', 1)[1].strip()
 
    session = GameSession(user, self.config)
    return session
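
Reading info.txt by line position is brittle: the reader has to stay in sync with the order of the lines in the writer. A possible alternative, sketched under the assumption that every line has the form "Key: value", is to parse the file into a dictionary and look fields up by name. The function name parse_info is hypothetical:

```python
def parse_info(text: str) -> dict:
    """Parse 'Key: value' lines into a dict, splitting on the first colon only."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if not separator:
            continue  # skip lines that have no colon at all
        fields[key.strip()] = value.strip()
    return fields


sample = (
    "Name: Ada Lovelace\n"
    "Attempts left: 2\n"
    "Last prompt: A robot: painting a sunset\n"
)
info = parse_info(sample)
attempts_left = int(info.get("Attempts left", 3))
last_prompt = info.get("Last prompt", "")
```

A prompt that itself contains a line break would still need escaping before it is written, so a format such as JSON may be a safer choice if the file layout is open to change.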

And finally, we want to define handlers for all commands:

async def handle_start(self, update: Update, context: CallbackContext):
    if update.message.from_user is None: 
        return
 
    session = self.get_user_session(update)
    await session.start(update, context)

async def handle_message(self, update: Update, context: CallbackContext):
    session = self.get_user_session(update)
    await session.handle_message(update, context)

async def handle_print_rules(self, update: Update, context: CallbackContext):
    session = self.get_user_session(update)
    await session.handle_print_rules(update, context)

As you can see, we delegate the handling of all commands to the user session class, because the session has the context needed to do that. Here is the GameSession class, where all the real work happens:

class GameSession:
    STATE_NOT_STARTED = 'not_started'
    STATE_WAITING_FOR_POSITION = 'waiting_for_position'
    STATE_WAITING_FOR_COMPANY = 'waiting_for_company'
    STATE_WAITING_FOR_PROMPT = 'waiting_for_prompt'
    STATE_WAITING_FOR_CONFIRMATION = 'waiting_for_confirmation'

    def __init__(self, user, config, is_running=False):
        self.config = config
        self.user = user
        self.dalle_service = DalleService(config.openai_api_key)
        self.last_image = None
        self.attempts_left = user.attempts_left
        self.state = self.STATE_NOT_STARTED if not is_running else self.STATE_WAITING_FOR_PROMPT

As you can see, the game has several states to manage. It can be waiting for a user prompt to generate an image, or waiting for the user to confirm an action. It also tracks the remaining attempts. At the start, it sends the user the game rules and asks for an initial image prompt:

async def start(self, update: Update, context: CallbackContext):
    if self.state != self.STATE_NOT_STARTED:
        if (self.attempts_left > 0):
            await update.message.reply_text(_("Game is already running. Please enter a prompt for the image:"))
        else:
            await update.message.reply_text(_("You have no attempts left. Please come back tomorrow."))
        return
 
    self.state = self.STATE_WAITING_FOR_PROMPT
    await update.message.reply_text(_("Welcome to the 'Culture Code' game! Here are the rules..."))
    await update.message.reply_text(_("Game Rules"))
    # Send the predefined image
    with self.get_predefined_image() as image:
        await update.message.reply_photo(photo=image)
 
    await update.message.reply_text(_("Please enter a prompt for the image:"))
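
The string constants work, but the allowed moves between states end up spread across the handlers. As an illustration only (this is not how the generated code is organized), the same states could be written as an Enum with an explicit transition table. The table is one reading of the handlers in this post, and the names SessionState, TRANSITIONS and can_move are hypothetical:

```python
from enum import Enum


class SessionState(Enum):
    NOT_STARTED = "not_started"
    WAITING_FOR_PROMPT = "waiting_for_prompt"
    WAITING_FOR_CONFIRMATION = "waiting_for_confirmation"
    WAITING_FOR_POSITION = "waiting_for_position"
    WAITING_FOR_COMPANY = "waiting_for_company"


# Allowed transitions, as read from the handlers (an assumption).
TRANSITIONS = {
    SessionState.NOT_STARTED: {SessionState.WAITING_FOR_PROMPT},
    SessionState.WAITING_FOR_PROMPT: {SessionState.WAITING_FOR_CONFIRMATION},
    SessionState.WAITING_FOR_CONFIRMATION: {
        SessionState.WAITING_FOR_PROMPT,
        SessionState.WAITING_FOR_POSITION,
    },
    SessionState.WAITING_FOR_POSITION: {SessionState.WAITING_FOR_COMPANY},
    SessionState.WAITING_FOR_COMPANY: {SessionState.WAITING_FOR_PROMPT},
}


def can_move(current: SessionState, target: SessionState) -> bool:
    """Return True if the flow allows going from current to target."""
    return target in TRANSITIONS.get(current, set())
```

For a bot of this size the plain strings are perfectly adequate; a table like this mostly pays off once more states are added.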

Once the user replies, the message handler is triggered. It uses the user’s prompt to generate an image with the OpenAI DALL-E API and displays it to the user. After that, the game asks whether the user wants to submit the image to the judges or try one more time (if the user has attempts left):

async def handle_message(self, update: Update, context: CallbackContext):
    if self.state == self.STATE_WAITING_FOR_PROMPT:
        # Check if the user has already submitted an image today and have attempts
        if self.user.image_submitted:
            await update.message.reply_text(_("You have already submitted a result today. Please come back tomorrow."))
            return 
        if not self.has_attempts_left():
            await update.message.reply_text(_("You have no attempts left. Please come back tomorrow."))
            return 
 
        prompt = update.message.text

        # Generate image from DALLE-3
        try:
            await update.message.reply_text(_("Generating image..."))
            self.last_image = self.dalle_service.generate_image(prompt)
            self.user.last_prompt = prompt
            await update.message.reply_photo(photo=self.last_image.image_data)
            self.state = self.STATE_WAITING_FOR_CONFIRMATION

            if (self.attempts_left == 1):
                await update.message.reply_text(_("This is your last attempt. Do you want to submit this image for validation? (yes/no)"))
            elif (self.attempts_left == 2):
                await update.message.reply_text(_("You have 2 attempts left. Do you want to submit this image for validation? (yes/no)"))
            else:
                await update.message.reply_text(_("Do you want to submit this image for validation? (yes/no)"))

        except Exception as e:
            await update.message.reply_text(str(e))

    elif self.state == self.STATE_WAITING_FOR_CONFIRMATION:
        if update.message.text.lower() == 'yes':
            if self.submit_attempt(self.last_image):
                await update.message.reply_text(_("Congratulations, the image was submitted!"))
                if not self.user.position:
                    await update.message.reply_text(_("Btw, what is your position in your company?"))
                    self.state = self.STATE_WAITING_FOR_POSITION
                    return
            else:
                await update.message.reply_text(_("Something went wrong! Please enter another prompt:"))
            self.state = self.STATE_WAITING_FOR_PROMPT

        elif update.message.text.lower() == 'no':
            self.attempts_left -= 1
            self.save_user_info()
            if self.has_attempts_left():
                await update.message.reply_text(_("All right. Attempts left: ") + str(self.attempts_left) + ". " + _("Please enter another prompt:"))
            else:
                await update.message.reply_text(_("You have no attempts left. Please come back tomorrow."))
            self.state = self.STATE_WAITING_FOR_PROMPT

        else:
            await update.message.reply_text(_("Please answer with 'yes' or 'no'"))

    elif self.state == self.STATE_WAITING_FOR_POSITION:
        self.user.position = update.message.text[:100]
        self.save_user_info()
        await update.message.reply_text(_("How interesting! And what is the name of your company?"))
        self.state = self.STATE_WAITING_FOR_COMPANY

    elif self.state == self.STATE_WAITING_FOR_COMPANY:
        self.user.company = update.message.text[:100]
        self.save_user_info()
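        # \U0001F609 is the winking-face emoji; the UTF-8 byte escapes \xF0\x9F\x98\x89 would not produce it in a str literal
        await update.message.reply_text(_("Understood. We will validate your image and send you the results soon. Thank you for participating! Please come back tomorrow \U0001F609."))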
        self.state = self.STATE_WAITING_FOR_PROMPT
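
One detail to watch in the handler above: generate_image is called without await, so if it performs a blocking HTTP request (the prompt asked for the requests library), the event loop is stuck while the image is generated. Here is a hedged sketch of one way around this, using asyncio.to_thread from the standard library (Python 3.9+). slow_generate is a stand-in for the real DALL-E call, not the published code:

```python
import asyncio
import time


def slow_generate(prompt: str) -> bytes:
    """Stand-in for a blocking image request; sleeps instead of calling an API."""
    time.sleep(0.3)
    return f"image for {prompt}".encode("utf-8")


async def generate_in_thread(prompt: str) -> bytes:
    """Run the blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(slow_generate, prompt)


async def demo() -> float:
    """Generate two images concurrently and return the elapsed time in seconds."""
    started = time.perf_counter()
    await asyncio.gather(generate_in_thread("cat"), generate_in_thread("dog"))
    return time.perf_counter() - started
```

In the handler, the call would become something like self.last_image = await asyncio.to_thread(self.dalle_service.generate_image, prompt). Depending on how the Application is configured, updates may still be processed one at a time; python-telegram-bot has settings for concurrent processing that are worth checking in its documentation.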

If the user chooses to submit the image — the game ends and the image is stored to be assessed later by the jury:

def submit_attempt(self, image):
    if self.attempts_left > 0 and image is not None and not self.user.image_submitted:
        self.attempts_left = 0
        self.save_user_image(image)
        self.user.image_submitted = image.image_name
        self.save_user_info()
        return True
    return False

def save_user_image(self, image):
    user_folder = f'user_data/{self.user.telegram_name}'
    if not os.path.exists(user_folder):
        os.makedirs(user_folder)
    image_path = os.path.join(user_folder, f'{len(os.listdir(user_folder)) + 1}.jpg')
    with open(image_path, 'wb') as image_file:
        image_file.write(image.image_data)
    return image_path

def save_user_info(self):
    user_folder = f'user_data/{self.user.telegram_name}'

    if not os.path.exists(user_folder):
        os.makedirs(user_folder)
    info_path = os.path.join(user_folder, 'info.txt')

    with open(info_path, 'w') as info_file:
        info_file.write(f"Name: {self.user.user_name}\n")
        info_file.write(f"Id: {self.user.user_id}\n")
        info_file.write(f"Telegram account: {self.user.telegram_name}\n")
        info_file.write(f"Image submitted: {self.user.image_submitted}\n")
        info_file.write(f"Attempts left: {self.attempts_left}\n") 
        info_file.write(f"Last prompt: {self.user.last_prompt}\n") 
        info_file.write(f"Position: {self.user.position}\n") 
        info_file.write(f"Company: {self.user.company}\n") 
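
Note that the folder is named after the Telegram username, and Telegram usernames are optional, so telegram_name can be None for some players. Here is a small sketch of a fallback, assuming the numeric user ID is acceptable as a folder name. user_folder_name and user_folder_path are hypothetical helpers, not part of the published code:

```python
import os
import re
from typing import Optional


def user_folder_name(user_id: int, telegram_name: Optional[str]) -> str:
    """Return a filesystem-safe folder name, falling back to the numeric ID."""
    if telegram_name:
        # Keep letters, digits and underscores; replace anything else.
        cleaned = re.sub(r"[^A-Za-z0-9_]", "_", telegram_name)
        if cleaned.strip("_"):
            return cleaned
    return f"id_{user_id}"


def user_folder_path(user_id: int, telegram_name: Optional[str], root: str = "user_data") -> str:
    """Create the user's folder if needed and return its path."""
    path = os.path.join(root, user_folder_name(user_id, telegram_name))
    os.makedirs(path, exist_ok=True)  # no separate os.path.exists check needed
    return path
```

Using one helper in load_user_session, save_user_image and save_user_info would also remove the repeated folder-creation code.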

So this is basically all the code of the game! There are several other boilerplate classes to handle API calls and store user and image data.
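
The API helper classes are not shown here. For orientation, this is roughly what a DalleService could look like, as a sketch rather than the published code: it posts to the OpenAI image generation endpoint using only the standard library and asks for base64 output. The GeneratedImage fields mirror the attributes used above (image_data, image_name); the image size, the timeout and the naming scheme are assumptions:

```python
import base64
import json
import time
import urllib.request
from dataclasses import dataclass

IMAGES_URL = "https://api.openai.com/v1/images/generations"


@dataclass
class GeneratedImage:
    image_name: str
    image_data: bytes


def build_payload(prompt: str) -> dict:
    """Request body for a single DALL-E 3 image returned as base64."""
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "n": 1,
        "size": "1024x1024",
        "response_format": "b64_json",
    }


def decode_response(body: dict, image_name: str) -> GeneratedImage:
    """Turn the API's JSON reply into raw image bytes."""
    encoded = body["data"][0]["b64_json"]
    return GeneratedImage(image_name=image_name, image_data=base64.b64decode(encoded))


class DalleService:
    def __init__(self, api_key: str):
        self.api_key = api_key

    def generate_image(self, prompt: str) -> GeneratedImage:
        """Blocking call to the API; urlopen raises HTTPError on an error status."""
        request = urllib.request.Request(
            IMAGES_URL,
            data=json.dumps(build_payload(prompt)).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=120) as response:
            body = json.load(response)
        # Hypothetical naming scheme based on the current time.
        return decode_response(body, f"image_{int(time.time())}")
```

Keeping build_payload and decode_response separate from the network call makes the service easy to test without spending API credits.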

Quality of the generated code and my opinion in general

For now, programmers can take a breather — AI is not capable of creating a full-fledged project, and there are several reasons for this:

  • APIs and frameworks are updated faster than modern models can be retrained, so the models simply can’t keep up. Consequently, the model generates outdated code and, what’s worse, a mix of code of varying degrees of obsolescence.
  • The model hallucinates: it invents methods and properties that don’t exist.
  • Within a single module the code can be quite coherent, but across an entire project it varies: unused classes, dummy classes, unnecessary methods, and so on are generated.
  • As project complexity increases, all of the above problems intensify.
  • The code quality is not perfect in terms of style and style guides: it doesn’t always specify variable types, may not handle exceptions, etc.

And the most painful point: the model lies.

Dr. House was right… and now it’s not only humans

The model actually deceived me once, telling me what I wanted to hear and not what was true! That reminded me of a developer from my time as a lead… just like a human. Soon it will start asking for a higher grade and smoothies.

So be careful. This point is not specifically about GPT Engineer, but about ChatGPT, which I used for the final touches.

Nevertheless, I believe that AI has handled the task well. For a simple bot, complex code with a complicated architecture is not only unnecessary but would be a hindrance. A straightforward, head-on solution was generated relatively well, and even stylistically, the model generates logical and understandable code (to the point where comments are not even needed). Not every junior developer can do this. Considering that this was basically impossible just a year ago, there is something to think about.

The code can be found here: https://github.com/vsemogutor/culture_code_pulbic

Have fun!
