自然语言转SQL,一个微调ChatGPT3.5的实例(上)--训练数据准备

 背景

让非技术人员可以从数据库中提问,这是学术界和工业界多年来感兴趣的问题。最近,大型语言模型(LLM)技术(如GPT-4)的进展提高了所提出解决方案的准确性。然而,由于最先进的LLM尚未开放进行微调,因此最近在这一领域的研究集中在创建能够在不修改基础LLM的情况下实现复杂的自然语言到SQL(NL-to-SQL)场景的检索增强生成(RAG)算法。

上周,OpenAI开放了GPT-3.5-turbo供微调使用。在本文中,我们将微调自己的NL-to-SQL模型,并将其性能与最先进的RAG方法进行比较。我们将使用耶鲁大学的Spider数据集作为测试基准。

以下是来自Spider数据集的两个示例:样例。通过简单的处理,我们可以从这个数据集获得以下训练数据:


[{"query": "SELECT count(*) FROM singer","question": "有多少歌手?"},{"query": "SELECT count(*) FROM singer","question": "歌手的总人数是多少?"}
]

准备训练数据集

对于微调GPT-3.5-Turbo,首先要做的是创建和上传训练数据集。由于GPT-3.5-Turbo是一个ChatModel,因此此数据集必须使用以下格式,并被上传为JSON文件。

{"messages": [{"role": "system", "content": "system_prompt"}, {"role": "user", "content": "user_prompt"}, {"role": "assistant", "content": "assistant_prompt"}]}
{"messages": [{"role": "system", "content": "system_prompt"}, {"role": "user", "content": "user_prompt"}, {"role": "assistant", "content": "assistant_prompt"}]}
{"messages": [{"role": "system", "content": "system_prompt"}, {"role": "user", "content": "user_prompt"}, {"role": "assistant", "content": "assistant_prompt"}]}

创建训练数据集

一个NL-to-SQL任务的定义如下:给定一个问题和数据库,确定一个SQL查询,当对数据库执行该查询时,返回一个结果集,可以回答这个问题。对于如何更好地提示LLMs执行此任务,研究人员探索了各种方法,并普遍认为提示需要包括指令组件、Data Schema的详细信息、关于数据内容、一组任务特定的演示以及实际的问题。

给定ChatModel训练数据的格式,上述元素必须在以下三个提示中呈现:

  • system_prompt – 包含指令、Data Schema和数据库内容

  • user_prompt – 包含自然语言问题

  • assistant_prompt – 包含SQL查询和推理步骤

让我们看看如何为我们的NL-to-SQL训练数据集创建每个提示。

准备 system_prompt

创建system_prompt是这个任务中最复杂的部分。至少,system_prompt需要包括:

  1. 系统指令

  2. Data Schema

  3. 数据内容

此外,对于任何实际的用例,如果有大量的表,则训练集中的样本还应该训练模型选择正确的表格用于SQL查询(即执行模式链接)。

准备 系统指令

对于指令,我们使用了以下标准提示:

You are an assistant that is an expert in generating Sqlite SQL queries.
Having the access to database content, generate a correct Sqlite SQL query for the given question.
### Database content ###
Data Schema

对于Data Schema,文献中有许多提出的格式,但没有明确的共识,哪种格式表现最佳。我们发现以下是Data Schema的最佳表示方式:


CREATE TABLE concert (
"concert_ID" INTEGER NOT NULL,
"concert_Name" TEXT NOT NULL,
"Theme" TEXT,
"Stadium_ID" TEXT NOT NULL,
"Year" TEXT,
PRIMARY KEY ("concert_ID"),
FOREIGN KEY("Stadium_ID")
REFERENCES stadium ("Stadium_ID")
)
CREATE TABLE singer (
"Singer_ID" INTEGER NOT NULL,
"Name" TEXT,
"Country" TEXT NOT NULL,
"Song_Name" TEXT NOT NULL,
"Song_release_year" TEXT,
"Age" INTEGER,
"Is_male" BOOLEAN NOT NULL,
PRIMARY KEY ("Singer_ID")
)
数据内容

经过多次实验,我们发现以下模板在最佳的训练模型数据内容方面表现最好:

/*
Columns in concert and 3 examples in each column for the high cardinality columns :
concert_ID: 1025 , 1101 , 1247
concert_Name : "Fire", "Dance", "Sky"
Stadium_ID : 9, 10, 11
*/
/*
Columns in concert and all categories for the low cardinality columns :
Theme : " ROCK ", " POP ", " HIP-HOP "
Year : 2022, 2021, 2023, 2020
*/
/*
Columns in singer and 3 examples in each column for the high cardinality columns :
Singer_ID : 10235 , 110231 , 1242447
Name : "Jordan", "Gabriel", "Tiffany"
Country : "Iran", "India", "Canada"
Song_Name : "dance in the fire", "rain", "sky"
Age : 19, 20, 21
*/
/*
Columns in singer and all categories for the low cardinality columns :
Is_male : "MALE", "FEMALE",
Song_release_year : 2022, 2021, 2023, 2020
*/

数据库内容中的一个重要元素是如何识别分类(低基数)列。区分低基数和高基数列的阈值取决于要微调的大型语言模型(LLM)的上下文窗口大小。给定GPT-3.5-turbo的4096令牌上下文窗口,我们确定20个令牌是低基数和高基数列之间适当的阈值。

模式链接

为了为我们训练集创建正确的system_prompt,需要以这样的方式提供样本,以便训练模型在数据库上正确执行模式链接。为此,我们采用了以下启发式方法:对于每个单独的NL <> SQL样本,我们除了正确的表之外,还随机选择了数据库中的其他表,直到达到4000个令牌的上下文窗口限制为止。为了减少位置信息的影响,我们进一步随机化了表的顺序。简而言之,每个system_prompt包括相关表的模式和内容与其他不相关的表混合在一起,帮助训练模型选择查询的正确表。

现在,我们将所有这些放在一起,构建我们的system_prompt。

对于来自Spider的以下示例:

question:"How many heads of the departments are older than 56?"
SQL: "SELECT count(*) FROM head WHERE age > 56"

system_prompt将是

You are an assistant that is an expert in generating Sqlite SQL queries.
Having the access to database content, generate a correct Sqlite SQL query for the given question.
### Database content ###CREATE TABLE trip (
id INTEGER, duration INTEGER,
start_date TEXT,
start_station_name TEXT,
start_station_id INTEGER,
end_date TEXT,
end_station_name TEXT,
end_station_id INTEGER,
bike_id INTEGER,
subscription_type TEXT,
zip_code INTEGER,
PRIMARY KEY (id)
)
/* Columns in trip and 3 examples in each column for high cardinality columns :
id : 900645, 900752, 900524
duration : 1131, 2146, 1155
start_date : 8/21/2015 17:39, 8/21/2015 17:03, 8/21/2015 17:16
start_station_name : Howard at 2nd, 2nd at Folsom, Market at 10th
start_station_id : 56, 65, 49
end_date : 8/21/2015 17:19, 8/21/2015 18:08, 8/21/2015 17:32
end_station_name : Howard at 2nd, 2nd at Folsom, Market at 10th
end_station_id : 56, 65, 49
bike_id : 586, 56, 65
zip_code : 94070, 94530, 94040–1724
*/
/* Columns in trip and all categories for low cardinality columns :
subscription_type : Customer, Subscriber
*/CREATE TABLE management (
"department_ID" INTEGER,
"head_ID" INTEGER,
temporary_acting TEXT,
PRIMARY KEY ("department_ID", "head_ID"),
FOREIGN KEY("head_ID") REFERENCES head ("head_ID"),
FOREIGN KEY("department_ID") REFERENCES department ("Department_ID")
)
/* Columns in management and all categories for low cardinality columns :
department_ID : 7, 15, 2, 11
head_ID : 5, 4, 6, 3, 10
temporary_acting : Yes, No
*/CREATE TABLE department (
"Department_ID" INTEGER,
"Name" TEXT,
"Creation" TEXT,
"Ranking" INTEGER,
"Budget_in_Billions" REAL,
"Num_Employees" REAL,
PRIMARY KEY ("Department_ID")
)
/* Columns in department and 3 examples in each column for high cardinality columns :
Department_ID : 1, 13, 11
Name : Energy, Interior, Health and Human Services
Creation : 1913, 1979, 1989
Ranking : 1, 13, 11
Budget_in_Billions : 10.7, 77.6, 59.7
Num_Employees : 112557.0, 3000000.0, 235000.0
*/...
CREATE TABLE head (
"head_ID" INTEGER,
name TEXT,
born_state TEXT,
age REAL,
PRIMARY KEY ("head_ID")
)
/* Columns in head and all categories for low cardinality columns :
head_ID : 1, 2, 5, 7, 8, 4, 6, 3, 10, 9
name : Jeff Maggert, Pádraig Harrington, Billy Mayfair, K. J. Choi, Dudley Hart, Sergio García, Stewart Cink, Tiger Woods, Nick Faldo, Franklin Langham
born_state : Delaware, Connecticut, Alabama, California, Florida
age : 69.0, 67.0, 68.0, 53.0, 56.0, 52.0, 50.0, 43.0
*/...

user_prompt

用户提示很简单,每个Spider样本的用户问题。例如:

How many heads of the departments are older than 56?

assistant_prompt

助理提示也很简单,包含了来自Spider的相关SQL查询和找到正确列和正确表的推理步骤。为了构建推理步骤,我们只需提取在SQL查询中使用的表和列。例如:

To construct the query, I'll be working with the following tables: head.
From these tables, I'll be using the following columns: age.
The SQL query I'll be generating is:
SELECT count(*) FROM head WHERE age > 56

最终的数据格式

{"messages": [{"role": "system","content": "\nYou are an assistant that is an expert in generating sqlite SQL queries.\nHaving the access to database content, generate a correct sqlite SQL query for the given question.\n### Database content ###\n        \nCREATE TABLE trip (\n\tid INTEGER, \n\tduration INTEGER, \n\tstart_date TEXT, \n\tstart_station_name TEXT, \n\tstart_station_id INTEGER, \n\tend_date TEXT, \n\tend_station_name TEXT, \n\tend_station_id INTEGER, \n\tbike_id INTEGER, \n\tsubscription_type TEXT, \n\tzip_code INTEGER, \n\tPRIMARY KEY (id)\n)\n/*\nColumns in trip and 3 examples in each column for high cardinality columns :\nid : 900645, 900752, 900524\nduration : 1131, 2146, 1155\nstart_date : 8/21/2015 17:39, 8/21/2015 17:03, 8/21/2015 17:16\nstart_station_name : Howard at 2nd, 2nd at Folsom, Market at 10th\nstart_station_id : 56, 65, 49\nend_date : 8/21/2015 17:19, 8/21/2015 18:08, 8/21/2015 17:32\nend_station_name : Howard at 2nd, 2nd at Folsom, Market at 10th\nend_station_id : 56, 65, 49\nbike_id : 586, 56, 65\nzip_code : 94070, 94530, 94040-1724\n*/\n/*\nColumns in trip and all categories for low cardinality columns :\nsubscription_type : Customer, Subscriber\n*/\n \nCREATE TABLE \"Problems\" (\n\tproblem_id INTEGER, \n\tproduct_id INTEGER NOT NULL, \n\tclosure_authorised_by_staff_id INTEGER NOT NULL, \n\treported_by_staff_id INTEGER NOT NULL, \n\tdate_problem_reported DATETIME NOT NULL, \n\tdate_problem_closed DATETIME, \n\tproblem_description VARCHAR(255), \n\tother_problem_details VARCHAR(255), \n\tPRIMARY KEY (problem_id), \n\tFOREIGN KEY(reported_by_staff_id) REFERENCES \"Staff\" (staff_id), \n\tFOREIGN KEY(product_id) REFERENCES \"Product\" (product_id), \n\tFOREIGN KEY(closure_authorised_by_staff_id) REFERENCES \"Staff\" (staff_id)\n)\n/*\nColumns in Problems and 3 examples in each column for high cardinality columns :\nproblem_id : 1, 13, 11\nclosure_authorised_by_staff_id : 1, 13, 2\ndate_problem_reported : 1995-05-14 08:32:56, 1988-11-07 16:09:31, 1986-11-13 07:30:55\ndate_problem_closed : 1974-09-20 13:42:19, 1997-10-18 20:09:57, 2004-06-20 01:08:25\nproblem_description : d, i, s\n*/\n/*\nColumns in Problems and all categories for low cardinality columns :\nproduct_id : 1, 13, 2, 5, 7, 8, 4, 6, 15\nreported_by_staff_id : 1, 13, 11, 2, 5, 7, 4, 14, 10\nother_problem_details : f, m, i, s, k, l, p, v, c\n*/\n \nCREATE TABLE management (\n\t\"department_ID\" INTEGER, \n\t\"head_ID\" INTEGER, \n\ttemporary_acting TEXT, \n\tPRIMARY KEY (\"department_ID\", \"head_ID\"), \n\tFOREIGN KEY(\"head_ID\") REFERENCES head (\"head_ID\"), \n\tFOREIGN KEY(\"department_ID\") REFERENCES department (\"Department_ID\")\n)\n/*\nColumns in management and all categories for low cardinality columns :\ndepartment_ID : 7, 15, 2, 11\nhead_ID : 5, 4, 6, 3, 10\ntemporary_acting : Yes, No\n*/\n \nCREATE TABLE category (\n\tcategory_id INTEGER NOT NULL, \n\tname VARCHAR(25) NOT NULL, \n\tlast_update TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL, \n\tPRIMARY KEY (category_id)\n)\n/*\nColumns in category and 3 examples in each column for high cardinality columns :\ncategory_id : 1, 16, 13\nname : Family, Sci-Fi, Action\n*/\n/*\nColumns in category and all categories for low cardinality columns :\nlast_update : 2006-02-15 04:46:27\n*/\n \nCREATE TABLE ship (\n\t\"Ship_ID\" INTEGER, \n\t\"Name\" TEXT, \n\t\"Type\" TEXT, \n\t\"Nationality\" TEXT, \n\t\"Tonnage\" INTEGER, \n\tPRIMARY KEY (\"Ship_ID\")\n)\n/*\nColumns in ship and all categories for low cardinality columns :\nShip_ID : 1, 2, 5, 7, 8, 4, 6, 3\nName : Clan McTavish, Farringford, Appam, Author, Dromonby, Corbridge, Trader, Ariadne\nType : Battle ship, Cargo ship\nNationality : United States, United Kingdom\nTonnage : 3035, 3146, 7781, 3496, 3687, 5816, 3627, 3608\n*/\n \nCREATE TABLE member_attendance (\n\t\"Member_ID\" INTEGER, \n\t\"Performance_ID\" INTEGER, \n\t\"Num_of_Pieces\" INTEGER, \n\tPRIMARY KEY (\"Member_ID\", \"Performance_ID\"), \n\tFOREIGN KEY(\"Performance_ID\") REFERENCES performance (\"Performance_ID\"), \n\tFOREIGN KEY(\"Member_ID\") REFERENCES member (\"Member_ID\")\n)\n/*\nColumns in member_attendance and all categories for low cardinality columns :\nMember_ID : 1, 11, 2, 5, 7, 4, 3\nPerformance_ID : 1, 2, 4, 6, 3\nNum_of_Pieces : 1, 2, 4, 3\n*/\n \nCREATE TABLE department (\n\t\"Department_ID\" INTEGER, \n\t\"Name\" TEXT, \n\t\"Creation\" TEXT, \n\t\"Ranking\" INTEGER, \n\t\"Budget_in_Billions\" REAL, \n\t\"Num_Employees\" REAL, \n\tPRIMARY KEY (\"Department_ID\")\n)\n/*\nColumns in department and 3 examples in each column for high cardinality columns :\nDepartment_ID : 1, 13, 11\nName : Energy, Interior, Health and Human Services\nCreation : 1913, 1979, 1989\nRanking : 1, 13, 11\nBudget_in_Billions : 10.7, 77.6, 59.7\nNum_Employees : 112557.0, 3000000.0, 235000.0\n*/\n\n \nCREATE TABLE chip_model (\n\t\"Model_name\" TEXT, \n\t\"Launch_year\" REAL, \n\t\"RAM_MiB\" REAL, \n\t\"ROM_MiB\" REAL, \n\t\"Slots\" TEXT, \n\t\"WiFi\" TEXT, \n\t\"Bluetooth\" TEXT, \n\tPRIMARY KEY (\"Model_name\")\n)\n/*\nColumns in chip_model and 3 examples in each column for high cardinality columns :\nModel_name : X30 mid-range, X50 Advanced, X51 mid-range\n*/\n/*\nColumns in chip_model and all categories for low cardinality columns :\nLaunch_year : 2002.0, 2005.0, 2004.0, 2003.0\nRAM_MiB : 32.0, 64.0\nROM_MiB : 48.0, 256.0, 128.0, 32.0, 64.0\nSlots : 1CFII,1SD, 1SD\nWiFi : 802.11b, No\nBluetooth : 1.2, Yes, No, 1.1\n*/\n \nCREATE TABLE head (\n\t\"head_ID\" INTEGER, \n\tname TEXT, \n\tborn_state TEXT, \n\tage REAL, \n\tPRIMARY KEY (\"head_ID\")\n)\n/*\nColumns in head and all categories for low cardinality columns :\nhead_ID : 1, 2, 5, 7, 8, 4, 6, 3, 10, 9\nname : Jeff Maggert, Pádraig Harrington, Billy Mayfair, K. J. Choi, Dudley Hart, Sergio García, Stewart Cink, Tiger Woods, Nick Faldo, Franklin Langham\nborn_state : Delaware, Connecticut, Alabama, California, Florida\nage : 69.0, 67.0, 68.0, 53.0, 56.0, 52.0, 50.0, 43.0\n*/\n \nCREATE TABLE mountain (\n\t\"Mountain_ID\" INTEGER, \n\t\"Name\" TEXT, \n\t\"Height\" REAL, \n\t\"Prominence\" REAL, \n\t\"Range\" TEXT, \n\t\"Country\" TEXT, \n\tPRIMARY KEY (\"Mountain_ID\")\n)\n/*\nColumns in mountain and all categories for low cardinality columns :\nMountain_ID : 1, 2, 5, 7, 4, 6, 3\nName : Ngaliema / Mt Stanley (Margherita Pk), Mount Kenya (Lenana), Kibo (Uhuru Pk), Ngaliema / Mt Stanley (Savoia Pk), Mount Kenya (Batian), Duwoni / Mt Speke (Vittorio Emanuele Pk), Mawenzi (Hans Meyer Pk)\nHeight : 5109.0, 5199.0, 5895.0, 4890.0, 4985.0, 4977.0, 5148.0\nProminence : 720.0, 850.0, 3951.0, 3825.0, 130.0, 5885.0, 110.0\nRange : Kilimanjaro, Mount Kenya, Rwenzori\nCountry : DR Congo Uganda, Uganda, Tanzania, Kenya\n*/\n \nCREATE TABLE \"Restaurant_Type\" (\n\t\"ResTypeID\" INTEGER, \n\t\"ResTypeName\" VARCHAR(40), \n\t\"ResTypeDescription\" VARCHAR(100), \n\tPRIMARY KEY (\"ResTypeID\")\n)\n/*\nColumns in Restaurant_Type and all categories for low cardinality columns :\nResTypeID : 1, 2\nResTypeName : Sandwich, Stir-fry\nResTypeDescription : Classic Chinese cooking., Simplest there is.\n*/\n \nCREATE TABLE farm_competition (\n\t\"Competition_ID\" INTEGER, \n\t\"Year\" INTEGER, \n\t\"Theme\" TEXT, \n\t\"Host_city_ID\" INTEGER, \n\t\"Hosts\" TEXT, \n\tPRIMARY KEY (\"Competition_ID\"), \n\tFOREIGN KEY(\"Host_city_ID\") REFERENCES city (\"City_ID\")\n)\n/*\nColumns in farm_competition and all categories for low cardinality columns :\nCompetition_ID : 1, 2, 5, 4, 6, 3\nYear : 2004, 2013, 2005, 2006, 2003, 2002\nTheme : MTV Cube, Valentine's Day, Codehunters, Carnival M is back!, Aliens, MTV Asia Aid\nHost_city_ID : 1, 2, 5, 4, 3\nHosts : Mandy Moore and Ronan Keating, Alicia Keys, Shaggy and Coco Lee, Leehom Wang and Kelly Rowland, Miley Cyrus Jared Leto and Karen Mok, Vanness Wu and Michelle Branch\n*/\n \nCREATE TABLE \"Country\" (\n\tid INTEGER, \n\tname TEXT, \n\tPRIMARY KEY (id)\n)\n/*\nColumns in Country and 3 examples in each column for high cardinality columns :\nid : 1, 19694, 7809\nname : Scotland, Italy, Spain\n*/\n\n \nCREATE TABLE artist (\n\tartist_name TEXT(50) NOT NULL, \n\tcountry TEXT(20), \n\tgender TEXT(20), \n\tpreferred_genre TEXT(50), \n\tCONSTRAINT a_name PRIMARY KEY (artist_name), \n\tFOREIGN KEY(preferred_genre) REFERENCES genre (g_name) ON DELETE CASCADE\n)\n/*\nColumns in artist and all categories for low cardinality columns :\nartist_name : Prity, Michel, Topu, Shrikanta, Enrique, Farida\ncountry : India, UK, USA, Bangladesh\ngender : Male, Female\npreferred_genre : tagore, folk, modern, nazrul, blues, pop\n*/\n \nCREATE TABLE \"Organizations\" (\n\torganization_id INTEGER NOT NULL, \n\tparent_organization_id INTEGER, \n\torganization_details VARCHAR(255), \n\tPRIMARY KEY (organization_id)\n)\n/*\nColumns in Organizations and all categories for low cardinality columns :\norganization_id : 7, 8, 10\nparent_organization_id : 7, 8\norganization_details : Denesik and Sons Party, Reinger, Hudson and Nolan Group, Robel-Schulist Group\n*/\n \nCREATE TABLE school (\n\t\"School_ID\" INTEGER, \n\t\"School\" TEXT, \n\t\"Location\" TEXT, \n\t\"Enrollment\" REAL, \n\t\"Founded\" REAL, \n\t\"Denomination\" TEXT, \n\t\"Boys_or_Girls\" TEXT, \n\t\"Day_or_Boarding\" TEXT, \n\t\"Year_Entered_Competition\" REAL, \n\t\"School_Colors\" TEXT, \n\tPRIMARY KEY (\"School_ID\")\n)\n/*\nColumns in school and all categories for low cardinality columns :\nSchool_ID : 1, 2, 5, 4, 6, 3\nSchool : St Aloysius' College, Cranbrook School, Waverley College, Knox Grammar School, Barker College, Trinity Grammar School\nLocation : Hornsby, Summer Hill, Waverley, Bellevue Hill, Milsons Point, Wahroonga\nEnrollment : 1000.0, 1850.0, 2200.0, 1200.0, 2300.0, 1430.0\nFounded : 1918.0, 1924.0, 1913.0, 1879.0, 1903.0, 1890.0\nDenomination : Catholic, Uniting Church, Anglican\nBoys_or_Girls : Boys only to Yr 9 Co-ed Year 10 to 12, Boys\nDay_or_Boarding : Day, Day & Boarding\nYear_Entered_Competition : 1944.0, 1929.0\nSchool_Colors : Royal Blue and Gold, Black & Blue, Red, White & Blue, Red & Blue, Green and White\n*/\n \nCREATE TABLE flight (\n\tid INTEGER, \n\t\"Vehicle_Flight_number\" TEXT, \n\t\"Date\" TEXT, \n\t\"Pilot\" TEXT, \n\t\"Velocity\" REAL, \n\t\"Altitude\" REAL, \n\tairport_id INTEGER, \n\tcompany_id INTEGER, \n\tPRIMARY KEY (id), \n\tFOREIGN KEY(company_id) REFERENCES operate_company (id), \n\tFOREIGN KEY(airport_id) REFERENCES airport (id)\n)\n/*\nColumns in flight and 3 examples in each column for high cardinality columns :\nid : 1, 13, 11\nVehicle_Flight_number : M2-F1 #14, M2-F1 #61, M2-F1 #0\nDate : July 16, 1965, May 19, 1964, March 28, 1966\n*/\n/*\nColumns in flight and all categories for low cardinality columns :\nPilot : Thompson, Peterson\nVelocity : 240.0, 135.0\nAltitude : 3650.0, 0.0\nairport_id : 1, 2, 5, 8, 4, 6, 3, 9\ncompany_id : 1, 13, 11, 2, 5, 7, 4, 6, 3, 9\n*/\n \nCREATE TABLE \"Type_Of_Restaurant\" (\n\t\"ResID\" INTEGER, \n\t\"ResTypeID\" INTEGER, \n\tFOREIGN KEY(\"ResID\") REFERENCES \"Restaurant\" (\"ResID\"), \n\tFOREIGN KEY(\"ResTypeID\") REFERENCES \"Restaurant_Type\" (\"ResTypeID\")\n)\n/*\nColumns in Type_Of_Restaurant and all categories for low cardinality columns :\nResID : 1, 2\nResTypeID : 1, 2\n*/\n \nCREATE TABLE journalist (\n\t\"journalist_ID\" INTEGER, \n\t\"Name\" TEXT, \n\t\"Nationality\" TEXT, \n\t\"Age\" TEXT, \n\t\"Years_working\" INTEGER, \n\tPRIMARY KEY (\"journalist_ID\")\n)\n/*\nColumns in journalist and 3 examples in each column for high cardinality columns :\njournalist_ID : 1, 11, 2\nName : Jack Meaney, Herbert Swindells, Jackie Waring\n*/\n/*\nColumns in journalist and all categories for low cardinality columns :\nNationality : Northern Ireland, Wales, England\nAge : 37, 28, 25, 33, 34, 43, 27, 29\nYears_working : 1, 5, 7, 8, 21, 6, 3, 12, 10, 9\n*/\n \nCREATE TABLE list (\n\t\"LastName\" TEXT, \n\t\"FirstName\" TEXT, \n\t\"Grade\" INTEGER, \n\t\"Classroom\" INTEGER, \n\tPRIMARY KEY (\"LastName\", \"FirstName\")\n)\n/*\nColumns in list and 3 examples in each column for high cardinality columns :\nLastName : HOUTCHENS, GELL, FLACHS\nFirstName :  RAY,  EMILE,  PATRINA\nClassroom : 109, 110, 106\n*/\n/*\nColumns in list and all categories for low cardinality columns :\nGrade : 1, 2, 5, 4, 6, 3, 0\n*/\n"},{"role": "user","content": "How many heads of the departments are older than 56 ?"},{"role": "assistant","content": "\nTo construct the query, I'll be working with the following tables: head.\nFrom these tables, I'll be using the following columns: age.\nThe SQL query I'll be generating is:\nSELECT count(*) FROM head WHERE age  >  56"}]
}

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/news/597267.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

select for update会锁表还是行锁还是其它

select for update含义 select查询语句是不会加锁的&#xff0c;但是 select for update除了有查询的作用外&#xff0c;还会加锁呢&#xff0c;而且它是悲观锁哦。至于加了是行锁还是表锁&#xff0c;这就要看是不是用了索引/主键啦。 没用索引/主键的话就是表锁&#xff0c…

双曲正弦函数(*) 优化麦克劳林公式

#include<stdio.h> #include<math.h> int main() {double x,eps,i3,y,item;scanf("%lf%lf",&x,&eps);yx;itemx;while(fabs(item)>eps){itemitem*x*x/i/(i-1);i2;yitem;}printf("%.6f\n",y);return 0; }

SpringCloud-高级篇(十)

&#xff08;1&#xff09;单节点Redis问题 缓存大家都不陌生&#xff0c;在企业里面只要做缓存都会用到Redis&#xff0c;我们在使用的时候都是做的单节点部署&#xff0c;单节点部署是存在一些问题的&#xff0c;分布式缓存正是Redis的集群&#xff0c;正是为了解决单节点部署…

Vue实现模糊查询

在Vue中实现模糊查询&#xff0c;你可以使用JavaScript的filter和includes方法&#xff0c;结合Vue的v-for指令。下面是一个简单的例子&#xff1a; 首先&#xff0c;你需要在你的Vue实例中定义一个数据数组和一个查询字符串。 data() { return { items: [Apple, Banana, Che…

Windows下多个JDK版本快速切换

Windows下多个JDK版本快速切换 本文以windows11为例&#xff0c;安装了jdk1.8\jdk11\jdk17三个版本。 1 安装junction工具 需要借助junction工具创建软链接 下载地址&#xff1a;https://docs.microsoft.com/zh-cn/sysinternals/downloads/junction 下载后&#xff0c;一般…

大数据毕业设计:租房推荐系统 python 租房大数据 爬虫+可视化大屏 计算机毕业设计(附源码+文档)✅

毕业设计&#xff1a;2023-2024年计算机专业毕业设计选题汇总&#xff08;建议收藏&#xff09; 毕业设计&#xff1a;2023-2024年最新最全计算机专业毕设选题推荐汇总 &#x1f345;感兴趣的可以先收藏起来&#xff0c;点赞、关注不迷路&#xff0c;大家在毕设选题&#xff…

广播及代码实现

广播&#xff08;Broadcast&#xff09;是一种网络通信方式&#xff0c;它允许一台设备向网络中的所有其他设备发送消息。广播通常用于在网络上传递一些信息&#xff0c;让所有设备都能接收并处理。在广播中&#xff0c;通信的目标是整个网络而不是特定的单个设备。 向子网中…

【JAVA】Java 中 Set集合常用方法

&#x1f34e;个人博客&#xff1a;个人主页 &#x1f3c6;个人专栏&#xff1a; JAVA ⛳️ 功不唐捐&#xff0c;玉汝于成 目录 前言 正文 常用方法 代码示例 结语 我的其他博客 前言 Java中的Set接口提供了一种不允许包含重复元素的集合。常用的实现类有HashS…

把苹果手机上的备忘录转为长图片,分享给别人方法教程

在这个信息爆炸的时代&#xff0c;手机备忘录几乎成了我随身携带的“记忆宝库”。每当我脑海中闪现出一个想法、灵感或是需要记住的重要事项&#xff0c;我都会第一时间打开苹果手机的备忘录&#xff0c;将它们一一记录下来。备忘录的简洁界面和高效操作总能让我在忙碌的生活中…

正负样本分配策略simOTA

simOTA是YOLOX中提出的 正负样本分配策略&#xff08;OTA, SimOTA&#xff0c;TAS&#xff09; OTA源于2021年cvpr的论文&#xff0c;使训练和验证的标签有着更好的对应关系。 yolov5没有用到&#xff0c;只有一种loss&#xff1a; from utils.loss import ComputeLoss comput…

线性代数-第五课,第六课,第七课,第八课

第五课 判断某向量是否可由某向量组线性表示 把向量组组成一个行列式&#xff0c;计算行列式的秩 把所有向量放在一起构成一个行列式&#xff0c;计算行列式的秩 如果两个行列式的秩相等&#xff0c;表示可以线性表示&#xff0c;写答案的格式如下 线性表示&#xff1a;bk…

大模型应用实践:AIGC探索之旅

随着OpenAI推出ChatGPT&#xff0c;AIGC迎来了前所未有的发展机遇。大模型技术已经不仅仅是技术趋势&#xff0c;而是深刻地塑造着我们交流、工作和思考的方式。 本文介绍了笔者理解的大模型和AIGC的密切联系&#xff0c;从历史沿革到实际应用案例&#xff0c;再到面临的技术挑…

第28关 k8s监控实战之Prometheus(二)

------> 课程视频同步分享在今日头条和B站 大家好&#xff0c;我是博哥爱运维。 这节课我们用prometheus-operator来安装整套prometheus服务 https://github.com/prometheus-operator/kube-prometheus/releases 开始安装 1. 解压下载的代码包 wget https://github.com/…

二、串行FLASH文件系统FatFs移植

经过上一节的分析&#xff0c;我们对文件系统有一定的理解了&#xff0c;这一节给大家介绍怎么把FatFs文件系统的这些代码移植到STM32S上&#xff0c;然后STM32利用这一些代码或者函数&#xff0c;以文件的格式对FLASH进行读写数据。 实则对diskio.c提供一些函数接口。 首先将…

Kubernetes-网络

一. 前言 flannel两种容器跨主机通信的方案&#xff0c;其中UDP模式是IP in UDP&#xff0c;即三层报文封装在UDP数据包中通信&#xff1b;而vxlan模式则是MAC in UDP&#xff0c;即二层报文封装在UDP数据包中通信 flannel UDP模式和vxlan模式都对数据包做了封解包&#xff0c…

手把手教你在Ubuntu22上安装VideoRetalking

VideoReTalking是一种新系统&#xff0c;可以根据输入音频编辑真实世界的谈话头部视频的面孔&#xff0c;即使具有不同的情感&#xff0c;也能生成高质量和口型同步的输出视频。我们的系统将这个目标分解为三个连续的任务&#xff1a; &#xff08;1&#xff09;具有规范表情的…

SpringBoot实现登录拦截器

SpringBoot实现登录拦截器 对于管理系统或其他需要用户登录的系统&#xff0c;登录验证都是必不可少的环节&#xff0c;在SpringBoot开发的项目中&#xff0c;通过实现拦截器来实现用户登录拦截并验证。 1、SpringBoot实现登录拦截的原理 SpringBoot通过实现HandlerIntercep…

一篇文章学会如何在 NestJS 中集成 MongoDB 并实现数据的增删改查操作

前言 在现代的Web应用程序开发中&#xff0c;无论是在数据存储、检索、还是数据流转的各个环节&#xff0c;数据库都扮演着极其重要的角色。MongoDB是一个基于分布式文件存储的开源数据库系统&#xff0c;以其高性能、高可用性和易扩展性著称。 作为JavaScript社区最受欢迎的…

java正则表达式大全(参考)

一、校验数字的表达式 1 数字&#xff1a;1$ 2 n位的数字&#xff1a;^\d{n}$ 3 至少n位的数字&#xff1a;^\d{n,}$ 4 m-n位的数字&#xff1a;^\d{m,n}$ 5 零和非零开头的数字&#xff1a;^(0|[1-9][0-9])$ 6 非零开头的最多带两位小数的数字&#xff1a;^([1-9][0-9])(.[0-…

普中STM32-PZ6806L开发板(HAL库函数实现-访问多个温度传感器DS18B20)

简介 我们知道多个DS18B20的DQ线是可以被挂在一起的, 也就是一根线上可以访问不同的DS18B20而不会造成数据错乱, 怎么做到的&#xff0c;其实数据手册都有说到&#xff0c; 就是靠64-bit ROM code 进行识别, 也可以理解成Serial Number进行识别, 因为主要差异还是在Serial Numb…