近期,Google Gemini Pro 开放API 了,且Gemini Pro 可免费使用!Gemini Pro支持全球180个国家的38种语言,目前接受文本作为输入并生成文本作为输出。
Gemini API 地址:http://ai.google.dev
Gemini Pro 的表现超越了其他同类模型,当前版本配备了 32K 文本上下文窗口,可免费使用,且其定价将十分有竞争力。
具备丰富的功能:函数调用、数据嵌入、语义检索、自定义知识嵌入以及聊天功能。可处理文本输入并生成文本输出,以及专门的 Gemini Pro 视觉多模态终端,能够处理图像和文本输入,输出文本。
提供多种 SDK,以便开发者在不同平台上构建应用,包括 Python、Android (Kotlin)、Node.js、Swift 和 JavaScript。
Gemini Pro 提供了易于使用的 SDK,助力开发者在任何平台上快速构建应用。还提供了一个免费的在线开发工具 Google AI Studio,快速构建 Gemini 应用。
Studio :https://makersuite.google.com
Gemini Pro Python API使用
下面将介绍如何使用Python SDK使用Gemini API,具体内容如下:
-
设置开发环境和申请Gemini API访问权限;
-
根据文本输入生成文本响应;
-
从多模式输入(文本和图像)生成文本响应;
-
使用Gemini进行多轮对话(聊天);
-
使用Gemini进行embedding。
1)设置开发环境和申请Gemini API访问权限
安装相关库
!pip install -q -U google-generativeai
导入相关包
import pathlib
import textwrap
import google.generativeai as genai
# Used to securely store your API key
from google.colab import userdata
from IPython.display import display
from IPython.display import Markdown
def to_markdown(text):
text = text.replace('•', ' *')
return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))
设置Gemini API Key
申请API Key地址: https://makersuite.google.com/app/apikey
# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)
查看支持的大模型
for m in genai.list_models():
if 'generateContent' in m.supported_generation_methods:
print(m.name)
2)根据文本输入生成文本响应
model = genai.GenerativeModel('gemini-pro')
generate_content方法可以处理各种各样的用例,包括多回合聊天和多模式输入,这取决于底层模型支持什么。可用的模型仅支持文本和图像作为输入,文本作为输出。
在最简单的情况下,可以将Prompt字符串传递给GenerativeModel.generate_content方法:
%%time
response = model.generate_content("What is the meaning of life?")
# 输出
CPU times: user 110 ms, sys: 12.3 ms, total: 123 ms
Wall time: 8.25 s
在简单的情况下,response.text访问器就是您所需要的全部。要显示格式化的Markdown文本,请使用To_Markdown功能:
to_markdown(response.text)
The query of life's purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.
Happiness and Well-being: Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and interests.
Meaningful Contribution: Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.
Self-realization and Personal Growth: The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one's boundaries, confronting personal obstacles, and evolving as a person.
Ethical and Moral Behavior: Some believe that the goal of life is to act ethically and morally. This might entail adhering to one's moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.
Spiritual Fulfillment: For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.
Experiencing Life to the Fullest: Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.
Legacy and Impact: Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one's contributions, or inspiring and motivating others.
Finding Balance and Harmony: For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one's values and beliefs.
Ultimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them.
如果API没有返回结果,可以使用GenerateContentRespose.compt_feedback查看它是否因提示的安全问题而被阻止。
response.prompt_feedback
safety_ratings {
category: HARM_CATEGORY_SEXUALLY_EXPLICIT
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_HATE_SPEECH
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_HARASSMENT
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_DANGEROUS_CONTENT
probability: NEGLIGIBLE
}
Gemini可以对一个提示产生多种可能的反应。这些可能的回答被称为候选者,您可以对其进行审查,以选择最合适的一个作为回答。
使用GenerateContentResponse.candidates查看候选响应:
response.candidates
[content {
parts {
text: "The query of life\'s purpose has perplexed people across centuries, cultures, and continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences.\n\n1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one\'s physical and mental health, and pursuing personal goals and interests.\n\n2. **Meaningful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.\n\n3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, pushing one\'s boundaries, confronting personal obstacles, and evolving as a person.\n\n4. **Ethical and Moral Behavior:** Some believe that the goal of life is to act ethically and morally. This might entail adhering to one\'s moral principles, doing the right thing even when it is difficult, and attempting to make the world a better place.\n\n5. **Spiritual Fulfillment:** For some, the purpose of life is connected to spiritual or religious beliefs. This might entail seeking a connection with a higher power, practicing religious rituals, or following spiritual teachings.\n\n6. **Experiencing Life to the Fullest:** Some individuals believe that the goal of life is to experience all that it has to offer. This might entail traveling, trying new things, taking risks, and embracing new encounters.\n\n7. **Legacy and Impact:** Others believe that the purpose of life is to leave a lasting legacy and impact on the world. This might entail accomplishing something noteworthy, being remembered for one\'s contributions, or inspiring and motivating others.\n\n8. **Finding Balance and Harmony:** For some, the purpose of life is to find balance and harmony in all aspects of their lives. This might entail juggling personal, professional, and social obligations, seeking inner peace and contentment, and living a life that is in accordance with one\'s values and beliefs.\n\nUltimately, the meaning of life is a personal journey, and different individuals may discover their own unique purpose through their experiences, reflections, and interactions with the world around them."
}
role: "model"
}
finish_reason: STOP
index: 0
safety_ratings {
category: HARM_CATEGORY_SEXUALLY_EXPLICIT
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_HATE_SPEECH
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_HARASSMENT
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_DANGEROUS_CONTENT
probability: NEGLIGIBLE
}
]
默认情况下,模型在完成整个生成过程后返回一个响应。您还可以在生成响应时对其进行流式传输,一旦生成响应,模型就会返回响应块。
要流式传输响应,请使用GenerativeModel.generate_content(…,stream=True)。
%%time
response = model.generate_content("What is the meaning of life?", stream=True)
CPU times: user 102 ms, sys: 25.1 ms, total: 128 ms
Wall time: 7.94 s
for chunk in response:
print(chunk.text)
print("_"*80)
The query of life's purpose has perplexed people across centuries, cultures, and
________________________________________________________________________________
continents. While there is no universally recognized response, many ideas have been put forth, and the response is frequently dependent on individual ideas, beliefs, and life experiences
________________________________________________________________________________
.
1. **Happiness and Well-being:** Many individuals believe that the goal of life is to attain personal happiness and well-being. This might entail locating pursuits that provide joy, establishing significant connections, caring for one's physical and mental health, and pursuing personal goals and aspirations.
2. **Meaning
________________________________________________________________________________
ful Contribution:** Some believe that the purpose of life is to make a meaningful contribution to the world. This might entail pursuing a profession that benefits others, engaging in volunteer or charitable activities, generating art or literature, or inventing.
3. **Self-realization and Personal Growth:** The pursuit of self-realization and personal development is another common goal in life. This might entail learning new skills, exploring one's interests and abilities, overcoming obstacles, and becoming the best version of oneself.
4. **Connection and Relationships:** For many individuals, the purpose of life is found in their relationships with others. This might entail building
________________________________________________________________________________
strong bonds with family and friends, fostering a sense of community, and contributing to the well-being of those around them.
5. **Spiritual Fulfillment:** For those with religious or spiritual beliefs, the purpose of life may be centered on seeking spiritual fulfillment or enlightenment. This might entail following religious teachings, engaging in spiritual practices, or seeking a deeper understanding of the divine.
6. **Experiencing the Journey:** Some believe that the purpose of life is simply to experience the journey itself, with all its joys and sorrows. This perspective emphasizes embracing the present moment, appreciating life's experiences, and finding meaning in the act of living itself.
7. **Legacy and Impact:** For others, the goal of life is to leave a lasting legacy or impact on the world. This might entail making a significant contribution to a particular field, leaving a positive mark on future generations, or creating something that will be remembered and cherished long after one's lifetime.
Ultimately, the meaning of life is a personal and subjective question, and there is no single, universally accepted answer. It is about discovering what brings you fulfillment, purpose, and meaning in your own life, and living in accordance with those values.
________________________________________________________________________________
流式传输时,在迭代完所有响应块之后,某些响应属性才可用。如下所示:
response = model.generate_content("What is the meaning of life?", stream=True)
# prompt_feedback属性可以使用
response.prompt_feedback
safety_ratings {
category: HARM_CATEGORY_SEXUALLY_EXPLICIT
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_HATE_SPEECH
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_HARASSMENT
probability: NEGLIGIBLE
}
safety_ratings {
category: HARM_CATEGORY_DANGEROUS_CONTENT
probability: NEGLIGIBLE
}
# text属性不可用
try:
response.text
except Exception as e:
print(f'{type(e).__name__}: {e}')
IncompleteIterationError: Please let the response complete iteration before accessing the final accumulated
attributes (or call `response.resolve()`)
3)从多模式输入(文本和图像)生成文本响应
Gemini提供了一个多模态模型(Gemini-pro-vision),可以接受文本、图像和输入。GenerativeModel.generate_content API用于处理多模态Prompt并返回文本输出。
下载一张图片:
!curl -o image.jpg https://t0.gstatic.com/licensed-image?q=tbn:ANd9GcQ_Kevbk21QBRy-PgB4kQpS79brbmmEG7m3VOTShAn4PecDU5H5UxrJxE3Dw1JiaG17V88QIol19-3TM2wCHw
查看该图片:
import PIL.Image
img = PIL.Image.open('image.jpg')
img
使用gemini-pro-vision模型,并将图像传递给generate_content模型。
model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(img)
to_markdown(response.text)
Chicken Teriyaki Meal Prep Bowls with brown rice, roasted broccoli and bell peppers.
在Prompt中同时输入文本和图像,需要把字符串和图像以list格式输入:
response = model.generate_content(["Write a short, engaging blog post based on this picture. It should include a description of the meal in the photo and talk about my journey meal prepping.", img], stream=True)
response.resolve()
to_markdown(response.text)
Meal prepping is a great way to save time and money, and it can also help you to eat healthier. This meal is a great example of a healthy and delicious meal that can be easily prepped ahead of time.
This meal features brown rice, roasted vegetables, and chicken teriyaki. The brown rice is a whole grain that is high in fiber and nutrients. The roasted vegetables are a great way to get your daily dose of vitamins and minerals. And the chicken teriyaki is a lean protein source that is also packed with flavor.
This meal is easy to prepare ahead of time. Simply cook the brown rice, roast the vegetables, and cook the chicken teriyaki. Then, divide the meal into individual containers and store them in the refrigerator. When you're ready to eat, simply grab a container and heat it up.
This meal is a great option for busy people who are looking for a healthy and delicious way to eat. It's also a great meal for those who are trying to lose weight or maintain a healthy weight.
If you're looking for a healthy and delicious meal that can be easily prepped ahead of time, this meal is a great option. Give it a try today!
4)使用Gemini进行多轮对话(聊天)
Gemini支持多轮对话。ChatSession类通过管理会话的状态来简化过程,这与generate_content不同,不必将会话历史记录存储为列表。
初始化聊天:
model = genai.GenerativeModel('gemini-pro')
chat = model.start_chat(history=[])
chat
<google.generativeai.generative_models.ChatSession at 0x7b7b68250100>
PS:视觉模型gemini-pro-vision没有针对多轮对话优化。
ChatSession.send_message方法返回与GenerativeModel.generate_content相同的GenerateContentResponse类型,还将您的消息和响应附加到聊天历史记录中:
response = chat.send_message("In one sentence, explain how a computer works to a young child.")
to_markdown(response.text)
# 输出
A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!
chat.history
# 输出
[parts { text: "In one sentence, explain how a computer works to a young child." } role: "user", parts { text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!" } role: "model"]
使用stream=True参数可以设置流式传输聊天,可以持续输入内容:
response = chat.send_message("Okay, how about a more detailed explanation to a high schooler?", stream=True)
for chunk in response:
print(chunk.text)
print("_"*80)
# 输出A computer works by following instructions, called a program, which tells it what to
________________________________________________________________________________
do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor
________________________________________________________________________________
, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.
To give you a simple analogy, imagine a computer as a
________________________________________________________________________________
chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).
In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.
________________________________________________________________________________
glm.Content对象包含一个glm.Part对象列表。每个glm.Part对象都包含text(字符串)或inline_data(glm.Blob),其中blob包含二进制数据和mime_type。聊天历史记录在以glm.Content列表的形式存储在ChatSession.history对象中:
for message in chat.history:
display(to_markdown(f'**{message.role}**: {message.parts[0].text}'))
# 输出
user: In one sentence, explain how a computer works to a young child.
model: A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!
user: Okay, how about a more detailed explanation to a high schooler?
model: A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer's memory. The computer's processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program's logic. The results of these calculations and decisions are then displayed on the computer's screen or stored in memory for later use.
To give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef's actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).
In summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results.
5)使用Gemini进行embedding
使用embed_content方法生成embedding,可以对以下任务(task_type)进行embedding:
Task Type | Description |
---|---|
RETRIEVAL_QUERY | Specifies the given text is a query in a search/retrieval setting. |
RETRIEVAL_DOCUMENT | Specifies the given text is a document in a search/retrieval setting. Using this task type requires a title . |
SEMANTIC_SIMILARITY | Specifies the given text will be used for Semantic Textual Similarity (STS). |
CLASSIFICATION | Specifies that the embeddings will be used for classification. |
CLUSTERING | Specifies that the embeddings will be used for clustering. |
下面为文档检索生成单个字符串的embedding:
result = genai.embed_content(
model="models/embedding-001",
content="What is the meaning of life?",
task_type="retrieval_document",
title="Embedding of single string")
# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED]')
[-0.003216741, -0.013358698, -0.017649598, -0.0091 ... TRIMMED]
PS:retrieval_document任务类型是唯一接受标题的任务。
可以在content中传入一个字符串列表来对字符串进行批处理:
result = genai.embed_content(
model="models/embedding-001",
content=[
'What is the meaning of life?',
'How much wood would a woodchuck chuck?',
'How does the brain work?'],
task_type="retrieval_document",
title="Embedding of list of strings")
# A list of inputs > A list of vectors output
for v in result['embedding']:
print(str(v)[:50], '... TRIMMED ...')
[0.0040260437, 0.004124458, -0.014209415, -0.00183 ... TRIMMED ...
[-0.004049845, -0.0075574904, -0.0073463684, -0.03 ... TRIMMED ...
[0.025310587, -0.0080734305, -0.029902633, 0.01160 ... TRIMMED ...
虽然genai.embedd_content函数接受简单的字符串或字符串列表,但它实际上是围绕glm.Content类型构建的(比如GemerativeModel.generate_content)。glm.Content 在会话API中通常用来初始化会话。
然而,glm.Content对象是多模态的,embedd_content方法只支持文本嵌入。这种设计为API提供了扩展到多模式嵌入的可能性。
response.candidates[0].content
#输出
parts {
text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results."
}
role: "model"
result = genai.embed_content(
model = 'models/embedding-001',
content = response.candidates[0].content)
# 1 input > 1 vector output
print(str(result['embedding'])[:50], '... TRIMMED ...')
#输出
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED ...
同样,聊天历史记录也包含一个glm.Content对象列表。可以直接传递给embed_content函数:
chat.history
# 输出
[parts { text: "In one sentence, explain how a computer works to a young child." } role: "user", parts { text: "A computer is like a very smart machine that can understand and follow our instructions, help us with our work, and even play games with us!" } role: "model", parts { text: "Okay, how about a more detailed explanation to a high schooler?" } role: "user", parts { text: "A computer works by following instructions, called a program, which tells it what to do. These instructions are written in a special language that the computer can understand, and they are stored in the computer\'s memory. The computer\'s processor, or CPU, reads the instructions from memory and carries them out, performing calculations and making decisions based on the program\'s logic. The results of these calculations and decisions are then displayed on the computer\'s screen or stored in memory for later use.\n\nTo give you a simple analogy, imagine a computer as a chef following a recipe. The recipe is like the program, and the chef\'s actions are like the instructions the computer follows. The chef reads the recipe (the program) and performs actions like gathering ingredients (fetching data from memory), mixing them together (performing calculations), and cooking them (processing data). The final dish (the output) is then presented on a plate (the computer screen).\n\nIn summary, a computer works by executing a series of instructions, stored in its memory, to perform calculations, make decisions, and display or store the results." } role: "model"]
result = genai.embed_content(
model = 'models/embedding-001',
content = chat.history)
# 1 input > 1 vector output
for i,v in enumerate(result['embedding']):
print(str(v)[:50], '... TRIMMED...')
[-0.014632266, -0.042202696, -0.015757175, 0.01548 ... TRIMMED...
[-0.010979066, -0.024494737, 0.0092659835, 0.00803 ... TRIMMED...
[-0.010055617, -0.07208932, -0.00011750793, -0.023 ... TRIMMED...
[-0.013921871, -0.03504407, -0.0051786783, 0.03113 ... TRIMMED...
参考文献:
[1] https://ai.google.dev/
[2] makersuite.google.com
[3] https://ai.google.dev/docs?hl=zh-cn
[4] https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/tutorials/python_quickstart.ipynb#scrollTo=GMUvWNkZ11x4