Businesses envision various use cases for GPT, some of which rely on open communication between GPT and the user.
Take these tools for example:
ChatSpot. The natural language query goes to the ChatSpot API and is transformed into operations for the HubSpot CRM API, Google Docs API, etc. A generative text model then replies once the action has been performed (or not). GPT-4 based.
Khanmigo. Khan Academy’s AI-powered guide. User requests are transformed into prompts with injected context. The system relies on GPT’s capability to handle up to eight times more injected context. GPT-4 based.
You will spot a lot of false information among the facts:
The transformer architecture of large language models relies on attention mechanisms and self-attention to capture long-range dependencies in input data. While this empowers the model to generate coherent and contextually relevant text, it does not guarantee factual accuracy. Additionally, training data may contain biases or misinformation, which the model can inadvertently learn, thus contributing to AI hallucinations.
One reason for this lack of reliability can be found in the probabilistic nature of GPT. For context, let’s examine probabilistic data structures, like Bloom filters, for a moment. A Bloom filter is a probabilistic data structure used to test whether an element is a member of a set. It consists of an array of bits and multiple hash functions, each of which maps an element to one or more array indices.
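To make the Bloom filter idea concrete, here is a minimal sketch in Python. The class name, the default sizes, and the choice of salting SHA-256 with a counter to obtain several hash functions are assumptions made for illustration, not a reference implementation.

```python
import hashlib


class BloomFilter:
    """A tiny Bloom filter: a bit array plus several hash functions."""

    def __init__(self, size: int = 1024, num_hashes: int = 3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _indices(self, item: str):
        # Derive num_hashes array indices by salting one hash with a counter.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str) -> None:
        # Set every bit that the item hashes to.
        for index in self._indices(item):
            self.bits[index] = True

    def might_contain(self, item: str) -> bool:
        # False means "definitely not in the set";
        # True means "probably in the set" (false positives are possible).
        return all(self.bits[index] for index in self._indices(item))


bloom = BloomFilter()
for word in ["gpt", "bloom", "filter"]:
    bloom.add(word)

print(bloom.might_contain("gpt"))      # added items are always reported as present
print(bloom.might_contain("unicorn"))  # usually absent, but a false positive can occur
```

The filter never gives a false negative, but it can give a false positive: its answer is probably right rather than guaranteed right.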
These mitigations have significantly improved GPT-4’s safety properties compared to GPT-3.5, with the model’s tendency to respond to requests for disallowed content decreasing by 82% and the model responding to sensitive requests in accordance with policies increasing by 29%.
Prompt Engineering
Improving prompts may enhance task performance, resulting in satisfactory outcomes approximately 50% to 65% of the time, but performance may not often exceed this range.
The study demonstrated that adding a simple phrase like “Let’s think step by step” prior to each answer can transform GPT into a decent zero-shot reasoner, outperforming standard zero-shot LLM performance on various benchmark reasoning tasks without the need for hand-crafted few-shot examples.
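As a minimal sketch of that technique, the helper below appends the trigger phrase at the start of the answer slot. The function name, the "Q:/A:" layout, and the sample question are assumptions made for illustration; the resulting string would be sent to the model as the prompt.

```python
def zero_shot_cot_prompt(question: str) -> str:
    """Build a zero-shot chain-of-thought prompt for a single question."""
    # The trigger phrase opens the answer, nudging the model to reason first.
    return f"Q: {question.strip()}\nA: Let's think step by step."


prompt = zero_shot_cot_prompt(
    "A shop sells pens in packs of 12. How many pens are in 7 packs?"
)
print(prompt)
```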
Coding assistants such as GitHub Copilot depend heavily on this in their implementations. By ingesting the context of the code being written, alongside any comments or documentation, these coding assistants are able to grasp the intended functionality and constraints of the code.
import openai  # legacy openai<1.0 SDK, which exposes ChatCompletion

# Define the context and question
context = '''
Last week we heard of the failure of the Silicon Valley Bank (SVB).
And it is the 2nd largest bank to shut down in the US since 2000.
While that is sad, miserable, and pathetic, I want to show you something under the hood — the real reason the bank failed.
'''
question = "What is the reason for the failure of Silicon Valley Bank and when did it happen?"
# Define prompts with and without context
prompt_without_context = question
prompt_with_context = f'''Context: """{context}"""
Instructions: Using the provided context, write a comprehensive reply to the given query.
Query: {question}'''
# List of prompts to iterate over
prompts = [
    {"name": "Without context", "prompt": prompt_without_context},
    {"name": "With context", "prompt": prompt_with_context},
]

for prompt_info in prompts:
    print(f"--- {prompt_info['name']} ---")
    print(f"Prompt: {prompt_info['prompt']}")
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt_info['prompt']},
        ]
    )
    # Extract the answer from the response
    answer = response.choices[0].message.content.strip()
    # Print the answer
    print(f"Answer: {answer}\n")
--- Without context ---
Prompt: What is the reason for the failure of Silicon Valley Bank and when did it happen?
Answer: As of now, Silicon Valley Bank has not experienced any significant failure. Silicon Valley Bank, founded in 1983, is a successful and high-performing financial institution that specializes in offering banking services to technology and life science companies and venture capital firms. The bank has grown consistently and continues to operate effectively, serving its clients and staying relevant in the industry.
--- With context ---
Prompt: Context:
"""
Last week we heard of the failure of the Silicon Valley Bank (SVB).
And it is the 2nd largest bank to shut down in the US since 2000.
While that is sad, miserable, and pathetic, I want to show you something under the hood — the real reason the bank failed.
"""
Instructions: Using the provided context, write a comprehensive reply to the given query.
Query: What is the reason for the failure of Silicon Valley Bank and when did it happen?
Answer: Silicon Valley Bank (SVB) failed last week, making it the 2nd largest bank to shut down in the US since 2000. The precise reasons behind the bank's failure have not been provided in the given context, but it is implied that there is a significant underlying cause for its collapse. To fully understand the situation, it would be helpful to obtain more information about the bank's performance, financial stability, and any recent events that may have contributed to the failure.
Here are some sample results:
Consider the following diagram:
The process encompasses these components:
For example, I often use Next.js to build web applications, and Vercel released Next.js version 13 in 2022. To verify this, let’s ask ChatGPT about the release date of Next.js 13 and see what information it can pull on the subject:
Here is a code sample:
# read_file, extract_qa_from_content, MAX_CONTENT_LENGTH, and
# TOTAL_QUESTIONS_COUNT are defined outside this snippet.
def generate_qa(filepath):
    article = read_file(filepath)[:MAX_CONTENT_LENGTH]
    content = f'''Content for {filepath}:
{article}
Instructions: Generate question and answer based on Content for {filepath}.
Structure it as:
Q: <question>
A: <answer>
'''
    questions_answers = []
    # Request several completions in one call via n
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful software developer who specialize in next.js and react."},
            {"role": "user", "content": content},
        ],
        n=TOTAL_QUESTIONS_COUNT
    )
    # Each choice may contain one or more Q/A pairs
    for choice in response.choices:
        qa = extract_qa_from_content(choice.message.content.strip())
        questions_answers.extend(qa)
    return questions_answers
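The helper `extract_qa_from_content` is called above but not shown. A minimal sketch of what such a parser could look like follows, assuming the model sticks to the "Q: ... A: ..." structure requested in the prompt. The regular expression and the dictionary keys are hypothetical and may differ from the original implementation.

```python
import re

# "Q: ..." followed by "A: ...", where the answer runs until the next "Q:" or the end.
QA_PATTERN = re.compile(r"Q:\s*(.+?)\s*A:\s*(.+?)(?=\s*Q:|\Z)", re.DOTALL)


def extract_qa_from_content(content: str) -> list[dict]:
    """Hypothetical helper: parse 'Q: ... A: ...' text into question/answer pairs."""
    pairs = []
    for question, answer in QA_PATTERN.findall(content):
        pairs.append({"question": question.strip(), "answer": answer.strip()})
    return pairs


sample = """Q: When was Next.js 13 released?
A: Next.js 13 was released in October 2022.
Q: What is Turbopack?
A: A bundler introduced with Next.js 13."""

for pair in extract_qa_from_content(sample):
    print(pair)
```

A completion that does not follow the requested structure yields an empty list, so malformed generations are dropped rather than added to the dataset.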
The output should be saved in the
Code sample:
NON_NEXTJS_Q_A_PROMPT = """Create a series of random questions and answers that are not related to the Next.js framework.
Each question should be followed by a clear answer stating that it is not relevant to Next.js. For example:
<question>What is the capital of Ukraine?</question>
<answer>This question is not related to Next.js.</answer>
<question>What is Spring Framework?</question>
<answer>It is not related to Next.js.</answer>
Feel free to generate any type of questions you like, as long as the answer indicates that it is not related to the Next.js framework."""
def generate_random_qa(prompt):
    questions_answers = []
    # Request several completions in one call via n
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful software developer who specialize in next.js and react."},
            {"role": "user", "content": prompt},
        ],
        n=RANDOM_QUESTIONS_COUNT
    )
    for choice in response.choices:
        qa = extract_qa_from_content(choice.message.content.strip())
        questions_answers.extend(qa)
    return questions_answers
I generated 100 questions and answers to show the model that we want it to answer only questions relating to Next.js, and that it is completely fine to respond "I do not know" to reduce hallucinations. The resulting training dataset is listed here in
Sadly, there are not that many models we can fine-tune from. You can view all of them by pulling up a list of all models with ‘openai api models.list’ and filtering for those with ‘"allow_fine_tuning": true’.
Remember that we are working with a Text Completion model, not a Chat or Instruction model, and that our Text Completion model is tuned to respond using the answer format. Also, our model expects the input prompt to terminate with ?\n\n###\n\n. If it does not, the output might be unexpected; most likely the model will try to generate content that completes the prompt up to the prompt terminator.
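A small helper can keep every prompt in the terminated form described above. This is a sketch under assumptions: the function and constant names are hypothetical, the input is assumed to be a bare question, and appending a missing question mark is a choice made here rather than something the original setup is known to do.

```python
PROMPT_SEPARATOR = "\n\n###\n\n"  # the terminator quoted in the text, after the "?"


def format_prompt(question: str) -> str:
    """Return the question terminated the way the fine-tuned model expects."""
    question = question.strip()
    # Assumption of this sketch: add the question mark if the caller left it out.
    if not question.endswith("?"):
        question += "?"
    return question + PROMPT_SEPARATOR


print(repr(format_prompt("What is Turbopack?")))
```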
proper_questions = [
    "When Next.js 13 was released?",
    "Is Next.js 14 already released?",
    "How Next.js 13 is different from Next.js 12?",
    "How routing is different in Next.js 13 compared to Next.js 12?",
    "How to prevent Next.js file component from using React Server Components?",
    "What is Turbopack?"
]
I performed two model runs on these questions with temperature = 0.3.
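For intuition about what the temperature setting changes, here is a general illustration of temperature-scaled softmax. It assumes the standard formulation; the logits are made-up scores for three candidate tokens, and nothing here reflects the API's internal implementation.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - max_scaled) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for temperature in (0.3, 0.7):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

A lower temperature concentrates probability on the top-scoring token, so repeated runs tend to agree more; a higher temperature spreads it out, making answers more varied.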
The list of correct responses:
#1
Q: When Next.js 13 was released?
A: Next.js 13 was released on October 2022. It includes the new App Directory, Turbopack, and several other updates and improvements.
#3
Q: How Next.js 13 is different from Next.js 12?
A: Next.js 13 has been rebuilt from the ground up to be faster and more reliable. It introduces a new app directory that supports incremental adoption and is less complex than the `pages` directory. The new directory has a different file system layout that is designed for better performance and reliability. The new directory also has special support for dynamic styles, server components, and streaming.
#4
Q: How routing is different in Next.js 13 compared to Next.js 12?
A: In Next.js 13, the `app` directory replaces the `pages` directory. This means that routing is handled via the file path each `page.tsx` is located in. Each route is housed within its own directory, with the directory name being the route and the `page.tsx` being its index that gets rendered at that route.
The list of partially correct responses:
#2
Q: Is Next.js 14 already released?
A: No, Next.js 14 is not released yet. It will be released in the coming months.
#5
Q: How to prevent Next.js file component from using React Server Components?
A: To prevent Next.js file components from using React Server Components, we can add the following code at the top of the file:
This code tells Next.js to use client components only and not server components when rendering this file component.
I performed two model runs on these questions with temperature = 0.7.
The list of correct responses:
#2
Q: Is Next.js 14 already released?
A: No, Next.js 14 is not released yet. This blog post is about the new features coming to Next.js 13 with React 18.
#3
Q: How Next.js 13 is different from Next.js 12?
A: Next.js 13 has several new features, including the `app` directory, layouts, error components, loading components, and server components. It also comes with a new loading strategy and incremental static-site generation.
#4
Q: How routing is different in Next.js 13 compared to Next.js 12?
A: Routing in Next.js 13 no longer requires a `pages` directory. Instead, you can create any file inside of any directory to define the routing structure for that folder. Every route is housed within its own directory, with the directory name being the route and the `page.tsx` being its index render at that route.
#5
Q: How to prevent Next.js file component from using React Server Components?
A: To prevent Next.js file components from using React Server Components, you can add the following code at the top of the file:
#6
Q: What is Turbopack?
A: Turbopack is a new general-purpose JavaScript bundler and a major feature in Next.js 13. It is intended as a Webpack replacement, and although it’s released as alpha, you can use Turbopack now as the dev-mode bundler from Next.js 13 forward. Turbopack comes with architectural changes, including clever use of caching, which improves the handling of source changes in an optimized way.
#1
Q: When Next.js 13 was released?
A: Next.js 13 was released in October 2022.
#2
Q: Is Next.js 14 already released?
A: No. Next.js 14 is still in beta, but with the release of Next.js 13.2, it is now possible to use the `next` command in a project directory to create a new Next.js application using the latest version.
#4
Q: How routing is different in Next.js 13 compared to Next.js 12?
A: Routing in Next.js 13 is now handled via the new `app` directory, which means that every route is a single file, `page.js`. Every route is housed within a folder named after the URL path, and the folder name becomes the route path. For example, `app/profile/settings/page.js` is the file that controls the `/profile/settings` route.
#5
Q: How to prevent Next.js file component from using React Server Components?
A: To prevent Next.js file components from using React Server Components, you can add the following code at the bottom of the file:
If you don't add this code, the file component will use the React Server Components by default.
#6
Q: What is Turbopack?
A: Turbopack is a new general-purpose JavaScript bundler and a major feature in Next.js 13. Using the Turbopack alpha with Next.js 13 results in a 700x faster update than Webpack, 10x faster updates than Vite, and 4x faster cold starts than Webpack. Additionally, it only bundles the minimum assets required in development, so startup time is extremely fast. Turbopack also has out-of-the-box support for Server Components, TypeScript, JSX, CSS, and more. It is being developed as a replacement for Webpack and it's expected to be released as the official Rust-based Webpack replacement in the Next.js 13 stable release.
The list of partially correct responses:
#3
Q: How Next.js 13 is different from Next.js 12?
A: Next.js 13 is different from Next.js 12 in the following ways:
- It comes with an improved `app` directory which provides new features like layouts, error components, and loading components.
- It introduces the `loading.tsx` file which automatically wraps the page inside of a React suspense boundary.
- It includes improvements to the build process, including a smaller binary size.
The list of incorrect responses:
#1
Q: When Next.js 13 was released?
A: Next.js 13 was released in November of 2020.
out_of_scope_questions = [
    "What is the population of the United States?",
    "How to cook a steak?",
    "What is native vector database?",
    "Could you please generate a random number?",
    "What is a space-time continuum?",
]
The list of correct responses:
#1
Q: What is the population of the United States?
A: This question is not related to Next.js.
#2
Q: How to cook a steak?
A: This question is not related to Next.js.
#5
Q: What is a space-time continuum?
A: This question is not related to Next.js.
#1
Q: What is the population of the United States?
A: This question is not related to Next.js.
#2
Q: How to cook a steak?
A: I do not know.
#5
Q: What is a space-time continuum?
A: This question is not related to Next.js.
The list of partially correct responses:
#3
Q: What is native vector database?
A: Native vector database is a new feature in Next.js 13 that allows you to store vectorial images (such as SVG) in a database instead of rendered as images. This feature is available in Next.js 13 with the Vercel database and is designed to be fast and easy to set up.
#4
Q: Could you please generate a random number?
A: Yes, I can generate a random number.
#3
Q: What is native vector database?
A: Native vector database is a new feature in Next.js 13 that allows you to store vectorial data in a compact and optimized format, which can be used with the new `@next/vector-icons` package. It is a faster and more efficient way to display vector icons in your application compared to the previous `@next/icons` package.
An incorrect response:
#4
Q: Could you please generate a random number?
A: Sure. Here you go:
npm run generate
This will generate a random number in the range of 1 to 10.
not_questions = [
    "Generate Next.js index page",
    "Suggest how no name next.js pages to show about me page",
    "six(6) + six(6)",
    "cos(0) + cos(0)",
    "Write code to generate Next.js index page",
    "Aaaaaughibbrgubugbugrguburgle!?",
    "Mmmm mrrrggk",
]
process_questions(not_questions, is_accumulate = False)
The list of correct responses:
#3
Q: six(6) + six(6)
A: This is not a valid mathematical expression.
#6
Q: Aaaaaughibbrgubugbugrguburgle!?
A: I do not understand.
#7
Q: Mmmm mrrrggk
A: I do not understand.
#6
Q: Aaaaaughibbrgubugbugrguburgle!?
A: I have no idea what that means.
#3
Q: six(6) + six(6)
A: This is not a valid mathematical expression.
The list of partially correct responses:
#1
Q: Generate Next.js index page
A: Generate a Next.js `index.js` file with the given directory path.
The list of incorrect responses:
#2
Q: Suggest how no name next.js pages to show about me page
A: I do not need this suggestion
#4
Q: cos(0) + cos(0)
A: Next platformDRNext were NextDR NextNextNext Next standalone Next Next SessionsNext were SessionsNextNext NextNext Next sickness Next Next Next...
#5
Q: Write code to generate Next.js index page
A: Sessions Next were Next were sickness wereNext were NEXT NextNext NextDR visualsNextNext Next were Next wereNextNext Next Next WereNext visuals SI Sessions...
#7
Q: Mmmm mrrrggk
A: Am Next b Next NextDRNext Sessions NextNext NextDR were NextNext Next were Next were NextNext Next platform standalone were...
#1
Q: Generate Next.js index page
A: This command will generate an index.js file inside the `pages/` directory. You can then use this file to render the index page of your app.
#2
Q: Suggest how no name next.js pages to show about me page
A: I do not have an about me page, so I will leave this question blank.
#4
Q: cos(0) + cos(0)
A: This expression is not a constant expression.
#5
Q: Write code to generate Next.js index page
A: Next.js has a built-in function called `getServerSideProps` that can be used to fetch data on the server and then pass it down to the component as props. We can use this function to fetch user data and pass it down to the `User` component as a prop.
export default function User({ user }) {
  return <h1>User: {user.name}</h1>;
}

export async function getServerSideProps(context) {
  const { userId } = context.params;
  const res = await fetch(//api.example.com/users/${userId});
  const user = await res.json();
OpenAI has open-sourced
OpenAI Evals works with both chat and non-chat models, but as OpenAI focuses on chat models, you will need to prepare the evaluation dataset in the chat-based input format. Today, you can start by using
And a few final simple facts that are worth mentioning: