In this post, I will show how to store the metadata of each machine learning training and testing process, paving the way for meta-learning algorithms.
1. An Introduction to the Meta-Learning Task
Meta-learning uses the metadata of previously acquired and classified learning processes to tackle new learning tasks that a machine learning algorithm has not encountered before. The task of meta-learning is learning to learn.
Deep learning models today cannot carry out many different kinds of tasks. For example, over the past few weeks I showcased an image classifier. Although tuning the hyperparameters for that specific task was easy, I did not end up with an efficient image classification model. It took up my time, and I only showed how to make the process more automated. Even if I had created a well-optimized convolutional neural network, I would still face the problem of a limited range of object classes. I will also tackle that by using reinforcement learning, keeping the model under continuous training and fetching its iterations when needed. But that still does not solve how to handle objects that look alike. For that, I would use a previous subset of deep learning training processes to gain knowledge about the problem and apply a formula during training.
The list could go on, but it leaves one question unanswered: how can I cut down the time spent tuning a model? This is where meta-learning comes to the rescue. For example, meta-learning approaches, such as learning how a problem is learned, can be applied to greatly reduce the time needed to create a model.
But how? We need to dig into how we humans perceive these problems in the first place in order to apply them to meta-learning. For example, how do we know a machine learning training process has gone wrong? By looking at the loss function. We look at the loss function because it is metadata derived from how well the deep neural network is performing at a specific task. How can we be sure of that? Because the results give us statistically higher prediction rates. This knowledge alone is enough to understand how optimization-based meta-learning techniques should work: by observing how effective an optimization is on the loss function.
It is also crucial to store the deep learning training process and its subsequent accuracy results in a single object so that they can be cross-compared with meta-learning methods.
2. What Does Meta-Learning Need?
Let's break down three common meta-learning approaches and see what they require:
**Model-based meta-learning:** The proposed model uses internal or external memory of the machine learning process to achieve better learning. This means that if you need to classify dogs and you have already classified other dogs, you can automatically reuse the model previously used to achieve that goal. The drawback of this focused meta-learning approach is the need for labeled objects.
**Metric-based meta-learning:** The proposed model uses different metrics to decide whether learning tasks are similar in their process. If you have to classify humans versus birds, you can use previously acquired metadata from mammal and bird classification to get good results. However, the drawback of this type of meta-learning is a case like the bat: a mammal that looks like a bird. (A minimal sketch of the metric idea follows this list.)
**Optimization-based meta-learning:** The proposed model uses metadata on the optimized hyperparameters of previous deep learning trainings to maximize results with meta-learning. This pure meta-learning approach will require an intensive machine learning process.
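To make the metric-based idea more concrete, here is a minimal, hypothetical sketch (not part of this project's code) of nearest-prototype classification with cosine similarity: tasks whose class prototypes sit close together in embedding space are the ones whose metadata is worth reusing, and a bat embedded near the bird prototype is exactly the failure mode mentioned above. The embeddings are assumed to come from any pretrained feature extractor.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def prototypes(embeddings, labels):
    # One prototype (mean embedding) per class label.
    return {label: np.mean([e for e, l in zip(embeddings, labels) if l == label], axis=0)
            for label in set(labels)}

def classify(query_embedding, protos):
    # Assign the query to the class whose prototype is most similar.
    return max(protos, key=lambda label: cosine(query_embedding, protos[label]))
```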
Now, to counter the weakness of the model-based meta-learning process, I already proposed a solution in one of my older blog posts. I used SerpApi's Google Images Scraper API to scrape images with specific labels via the chips parameter in order to create large-scale datasets (and used only images of a specific size to automate preprocessing).
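For reference, a minimal sketch of that kind of call with SerpApi's Python client; the query, chips value, and API key below are placeholders, and the full dataset-building code lives in the older post:

```python
from serpapi import GoogleSearch

params = {
    "engine": "google",
    "q": "American Eskimo Dog imagesize:500x500",   # imagesize: keeps dimensions uniform for preprocessing
    "tbm": "isch",                                  # Google Images results
    "chips": "<chips value for a specific label>",  # placeholder: narrows results to a labeled subset
    "api_key": "<SerpApi API Key>"
}

results = GoogleSearch(params).get_dict()
image_links = [item["original"] for item in results.get("images_results", [])]
```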
I don't have a complete solution for the weakness of metric-based meta-learning. However, I have noticed that many search engines tend to enrich their knowledge graphs, answer boxes, and related search items (related questions, related searches, etc.), which could help in connecting new tasks to similar ones. Admittedly, this is a vague idea. You can check the documentation and examples of SerpApi's Google Knowledge Graph Scraper API, SerpApi's Google Answer Box Scraper API, and other relevant docs to get a better sense of how to utilize them in meta-learning. You can also sign up to claim free credits.
I don't have a solution for the weakness of optimization-based meta-learning either. However, this week I will show how to store machine learning training processes using asynchronous calls, which is crucial for this meta-learning approach. Asynchronous processing in computer science refers to distributed tasks that run without affecting each other's progress. In our case, it saves us from having to wait for a training process to end before running multiple calls.
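As a bare-bones illustration of that idea (not the project code itself, which relies on FastAPI's BackgroundTasks below), two slow jobs can run concurrently without one blocking the other:

```python
import asyncio

async def fake_training(name: str, seconds: int):
    # Stand-in for a long-running training call.
    await asyncio.sleep(seconds)
    return f"{name} finished after {seconds}s"

async def main():
    # Both "trainings" progress at the same time; neither waits for the other.
    results = await asyncio.gather(
        fake_training("model_a", 3),
        fake_training("model_b", 5),
    )
    print(results)

asyncio.run(main())
```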
It is crucial to have a good machine learning framework that can store training examples as external objects or data points, compare them, and keep the metadata of stored trainings available for future use. It would also be very useful to gather SGD (Stochastic Gradient Descent), RNNs (Recurrent Neural Networks), regression, few-shot learning, and the like in one place and implement transfer learning with a common syntax. The goal of this blog series is to implement at least a part of what is proposed in this post. Once it is open sourced on SerpApi's Github page, I hope my mistakes (especially on the frontend) will be fixed with the help of others. Just like the best-performing programmers in the real world, the aim is to minimize the need for customization when training a model for a specific problem, and to let the resulting models multitask. Of course, at first I don't expect a supervised learning model that can handle several classification tasks to also write a poem. But the ability to cross-compare benchmarks of different model parameters with human observation, and to do at least some meta-training, is an exciting step for someone like me who is trying to pick up new skills.
3. Storing Machine Learning Models
I have created an Attempt object to be stored on the storage server under the model scope.
from pydantic import BaseModel

class Attempt(BaseModel):
    id: int | None = None
    name: str | None = None
    training_commands: dict = {}
    training_losses: list = []
    n_epoch: int = 0
    testing_commands: dict = {}
    accuracy: float = 0.0
    status: str = "incomplete"
    limit: int = 0
It contains a unique id that will be used to call the training process in the coming weeks, a name for the model file name, training_commands holding the dictionary we use to trigger training, training_losses for observing the state of the training at each backpropagation, n_epoch (the number of epochs) for creating real-time visual graphs, testing_commands holding the dictionary used to trigger the testing process, accuracy for storing the model's accuracy, status for observing its state, and the limit used in the testing process.
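As a quick sanity check, the object behaves like any Pydantic model; the values below are placeholders:

```python
a = Attempt(id=0, name="american_dog_species_3", n_epoch=100)
print(a.dict())
# {'id': 0, 'name': 'american_dog_species_3', 'training_commands': {}, 'training_losses': [],
#  'n_epoch': 100, 'testing_commands': {}, 'accuracy': 0.0, 'status': 'incomplete', 'limit': 0}
```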
Let's initialize the class that communicates with the models database:
# Couchbase Python SDK (4.x-style) imports assumed by this class
from datetime import timedelta
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions, ClusterTimeoutOptions, QueryOptions

class ModelsDatabase:
    def __init__(self):
        username = "<Storage Server Username>"
        password = "<Storage Server Password>"
        bucket_name = "images"
        auth = PasswordAuthenticator(username, password)
        timeout_opts = ClusterTimeoutOptions(kv_timeout=timedelta(seconds=10))
        self.cluster = Cluster('couchbase://localhost', ClusterOptions(auth, timeout_options=timeout_opts))
        self.cluster.wait_until_ready(timedelta(seconds=5))
        cb = self.cluster.bucket(bucket_name)
        self.cb_coll = cb.scope("model").collection("attempt")

    def insert_attempt(self, doc: Attempt):
        doc = doc.dict()
        print("\nInsert CAS: ")
        try:
            key = doc["name"]
            result = self.cb_coll.insert(key, doc)
            print(result.cas)
        except Exception as e:
            print(e)

    def get_attempt_by_name(self, name):
        try:
            sql_query = 'SELECT attempt FROM `images`.model.attempt WHERE name = $1'
            row_iter = self.cluster.query(sql_query, QueryOptions(positional_parameters=[name]))
            rows_arr = []
            for row in row_iter:
                rows_arr.append(row)
            return rows_arr[0]['attempt']
        except Exception as e:
            print(e)

    def get_attempt_by_id(self, id):
        try:
            sql_query = 'SELECT attempt FROM `images`.model.attempt WHERE id = $1'
            row_iter = self.cluster.query(sql_query, QueryOptions(positional_parameters=[id]))
            rows_arr = []
            for row in row_iter:
                rows_arr.append(row)
            return rows_arr[0]['attempt']
        except Exception as e:
            print(e)

    def update_attempt(self, doc: Attempt):
        try:
            key = doc.name
            result = self.cb_coll.upsert(key, doc.dict())
        except Exception as e:
            print(e)

    def get_latest_index(self):
        try:
            sql_query = 'SELECT COUNT(*) as latest_index FROM `images`.model.attempt'
            row_iter = self.cluster.query(sql_query, QueryOptions())
            for row in row_iter:
                return row['latest_index']
        except Exception as e:
            print(e)
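Assuming a local Couchbase cluster with an `images` bucket and a `model.attempt` collection as configured above, the class can be exercised like this (a sketch, not production code):

```python
db = ModelsDatabase()
db.insert_attempt(Attempt(id=0, name="american_dog_species_3"))

stored = db.get_attempt_by_name("american_dog_species_3")
print(stored["status"])       # "incomplete"
print(db.get_latest_index())  # number of stored attempts, used below as the next id
```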
There are also some helper endpoints in the main file:
@app.post("/create_attempt")
def create_attempt(a: Attempt):
    db = ModelsDatabase()
    db.insert_attempt(a)
    return {"status": "Success"}

@app.post("/find_attempt/")
def find_attempt(name: str):
    db = ModelsDatabase()
    attempt = db.get_attempt_by_name(name)
    return attempt

@app.post("/update_attempt")
def update_attempt(a: Attempt):
    db = ModelsDatabase()
    db.update_attempt(a)
    return {"status": "Success"}

@app.post("/latest_attempt_index/")
def return_index():
    db = ModelsDatabase()
    index = db.get_latest_index()
    return {"status": index}
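With the app served locally (for example `uvicorn main:app`), the endpoints can be exercised with any HTTP client; a sketch using `requests` and FastAPI's default `http://localhost:8000`:

```python
import requests

base = "http://localhost:8000"

# Create an attempt record.
requests.post(f"{base}/create_attempt", json={"id": 0, "name": "american_dog_species_3"})

# Fetch it back by name (`name` is a query parameter here, not part of the body).
r = requests.post(f"{base}/find_attempt/", params={"name": "american_dog_species_3"})
print(r.json()["status"])

# Number of attempts stored so far.
print(requests.post(f"{base}/latest_attempt_index/").json())
```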
Let's update the training endpoint so that it creates a model object for us in the database:
@app.post("/train/")
async def train(tc: TrainCommands, background_tasks: BackgroundTasks):
    def background_training(tc):
        if 'name' in tc.model and tc.model['name'] != "":
            model = eval(tc.model['name'])
        else:
            model = CustomModel
        try:
            # Reuse an existing attempt with the same name: reset its losses and mark it as training.
            a = find_attempt(name=tc.model_name)
            a["status"] = "Training"
            a["training_losses"] = []
            a = Attempt(**a)
            update_attempt(a)
            index = a.id
        except:
            # No attempt with that name yet: create a fresh one with the next available id.
            index = return_index()['status']
            a = Attempt(name=tc.model_name, training_commands=tc.dict(), status="Training", n_epoch=tc.n_epoch, id=index)
            create_attempt(a=a)
        trainer = Train(tc, model, CustomImageDataLoader, CustomImageDataset, ImagesDataBase)
        trainer.train()
        model = None
        try:
            torch.cuda.empty_cache()
        except:
            pass
    # Run the training outside the request/response cycle so the endpoint returns immediately.
    background_tasks.add_task(background_training, tc)
    return {"status": "Complete"}
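Triggering a training run is then a single POST; the endpoint returns right away while the actual training continues in the background task. The payload below is abbreviated; the full TrainCommands structure comes from the earlier posts in this series and can be seen in the stored example at the end of this post:

```python
import requests

training_command = {
    "model_name": "american_dog_species_3",
    "n_epoch": 100,
    # ... criterion, optimizer, model layers, label_names, transforms, etc.
}

print(requests.post("http://localhost:8000/train/", json=training_command).json())
# {'status': 'Complete'} is returned immediately; training keeps running in the background
```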
Let's collect our losses during the training process (together with a scheduler for the gradient steps):
def train(self):
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    Epoch = [x for x in range(0, self.n_epoch)]
    Loss = [0] * self.n_epoch
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(self.optimizer, 'min')
    for epoch in range(self.n_epoch):
        running_loss = 0.0
        inputs, labels = self.loader.iterate_training()
        inputs, labels = inputs.to(device), labels.to(device)
        self.optimizer.zero_grad()
        if torch.cuda.is_available():
            self.model.cuda()
            outputs = self.model(inputs).to(device)
        else:
            outputs = self.model(inputs)
        loss = self.criterion(outputs, labels.squeeze())
        loss.backward()
        self.optimizer.step()
        running_loss = running_loss + loss.item()
        # Let the scheduler react to this epoch's loss.
        scheduler.step(running_loss)
        # Push the epoch's loss to the storage server so the attempt is always up to date.
        from main import find_attempt, update_attempt
        a = find_attempt(name=self.model_name)
        a['training_losses'].append(running_loss)
        a = Attempt(**a)
        update_attempt(a)
        if epoch % 5 == 4:
            print(f'[Epoch: {epoch + 1}, Progress: {((epoch+1)*100/self.n_epoch):.3f}%] loss: {running_loss:.6f}')
            running_loss = 0.0
    torch.save(self.model.state_dict(), "models/{}.pt".format(self.model_name))
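Because every epoch's running loss is pushed to the storage server, the live state of a run can be pulled back at any moment and plotted; a minimal sketch with `requests` and `matplotlib`, assuming the server runs on the default port:

```python
import requests
import matplotlib.pyplot as plt

attempt = requests.post(
    "http://localhost:8000/find_attempt/",
    params={"name": "american_dog_species_3"},
).json()

losses = attempt["training_losses"]
plt.plot(range(1, len(losses) + 1), losses)
plt.xlabel("Epoch")
plt.ylabel("Running loss")
plt.title(f"{attempt['name']} ({attempt['status']})")
plt.show()
```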
Another update makes the testing process asynchronous and lets it communicate with the storage server:
@app.post("/test/")
async def test(tc: TestCommands, background_tasks: BackgroundTasks):
    def background_testing(tc):
        if 'name' in tc.model and tc.model['name'] != "":
            model = eval(tc.model['name'])
        else:
            model = CustomModel
        try:
            # Mark the attempt as being tested and store the testing commands.
            a = find_attempt(name=tc.model_name)
            a["testing_commands"] = tc.dict()
            a["status"] = "Testing"
            a = Attempt(**a)
            update_attempt(a)
        except:
            return {"status": "No Model Attempt by that Name"}
        tester = Test(tc, CustomImageDataset, ImagesDataBase, model)
        accuracy = tester.test_accuracy()
        # Store the resulting accuracy and mark the attempt as complete.
        a = find_attempt(name=tc.model_name)
        a["accuracy"] = accuracy
        a["status"] = "Complete"
        a = Attempt(**a)
        update_attempt(a)
        model = None
        try:
            torch.cuda.empty_cache()
        except:
            pass
    background_tasks.add_task(background_testing, tc)
    return {"status": "Success"}
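Since testing also runs in the background, a caller can poll the attempt until its status flips to Complete and then read the stored accuracy; a simple hypothetical polling loop:

```python
import time
import requests

def wait_for_accuracy(name: str, base: str = "http://localhost:8000", interval: int = 30):
    # Poll the stored attempt until the background test marks it Complete.
    while True:
        attempt = requests.post(f"{base}/find_attempt/", params={"name": name}).json()
        if attempt["status"] == "Complete":
            return attempt["accuracy"]
        time.sleep(interval)

print(wait_for_accuracy("american_dog_species_3"))
```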
Here is a stored item produced by this tutorial:
{"id": 4,"name": "american_dog_species_3","training_commands": {"batch_size": 4,"criterion": {"name": "CrossEntropyLoss"},"image_ops": [{"resize": {"resample": "Image.ANTIALIAS","size": [500, 500]}}, {"convert": {"mode": "'RGB'"}}],"label_names": ["American Hairless Terrier imagesize:500x500", "Alaskan Malamute imagesize:500x500", "American Eskimo Dog imagesize:500x500", "Australian Shepherd imagesize:500x500", "Boston Terrier imagesize:500x500", "Boykin Spaniel imagesize:500x500", "Chesapeake Bay Retriever imagesize:500x500", "Catahoula Leopard Dog imagesize:500x500", "Toy Fox Terrier imagesize:500x500"],"model": {"layers": [{"in_channels": 3,"kernel_size": 5,"name": "Conv2d","out_channels": 6}, {"inplace": true,"name": "ReLU"}, {"kernel_size": 2,"name": "MaxPool2d","stride": 2}, {"in_channels": "auto","kernel_size": 5,"name": "Conv2d","out_channels": 16}, {"inplace": true,"name": "ReLU"}, {"kernel_size": 2,"name": "MaxPool2d","stride": 2}, {"in_channels": "auto","kernel_size": 5,"name": "Conv2d","out_channels": 32}, {"inplace": true,"name": "ReLU"}, {"kernel_size": 2,"name": "MaxPool2d","stride": 2}, {"name": "Flatten","start_dim": 1}, {"in_features": 111392,"name": "Linear","out_features": 120}, {"inplace": true,"name": "ReLU"}, {"in_features": "auto","name": "Linear","out_features": 84}, {"inplace": true,"name": "ReLU"}, {"in_features": "auto","name": "Linear","out_features": "n_labels"}],"name": ""},"model_name": "american_dog_species_3","n_epoch": 100,"n_labels": 0,"optimizer": {"lr": 0.001,"momentum": 0.9,"name": "SGD"},"target_transform": {"ToTensor": true},"transform": {"Normalize": {"mean": [0.5, 0.5, 0.5],"std": [0.5, 0.5, 0.5]},"ToTensor": true}},"training_losses": [2.1530826091766357, 2.2155375480651855, 2.212409019470215, 2.171882152557373, 2.193148374557495, 2.174982786178589, 2.2089200019836426, 2.166707992553711, 2.1700942516326904, 2.196320056915283, 2.228410243988037, 2.2278425693511963, 2.1531643867492676, 2.1904003620147705, 2.1973652839660645, 2.1950249671936035, 2.1686930656433105, 2.182337999343872, 2.2186434268951416, 2.2066121101379395, 2.172186851501465, 2.217101573944092, 2.2250301837921143, 2.22577166557312, 2.2089788913726807, 2.1954753398895264, 2.19649338722229, 2.1682443618774414, 2.2124178409576416, 2.1765542030334473, 2.15944766998291, 2.2267537117004395, 2.1671102046966553, 2.218825101852417, 2.2200405597686768, 2.1963484287261963, 2.199852705001831, 2.2375543117523193, 2.1804018020629883, 2.2097158432006836, 2.1749439239501953, 2.213040351867676, 2.2149901390075684, 2.1947004795074463, 2.164980411529541, 2.1940670013427734, 2.229835033416748, 2.2061691284179688, 2.2089390754699707, 2.207270622253418, 2.235719680786133, 2.185238838195801, 2.222529411315918, 2.1917202472686768, 2.214961528778076, 2.181013584136963, 2.2280330657958984, 2.2193360328674316, 2.2151079177856445, 2.1822409629821777, 2.181617498397827, 2.213880777359009, 2.2002997398376465, 2.221768379211426, 2.1861824989318848, 2.191596508026123, 2.2087886333465576, 2.1659762859344482, 2.1675500869750977, 2.1987595558166504, 2.2219362258911133, 2.2185418605804443, 2.2019474506378174, 2.2085072994232178, 2.168557643890381, 2.1841750144958496, 2.206641674041748, 2.165733814239502, 2.193709373474121, 2.2362961769104004, 2.1809918880462646, 2.1982641220092773, 2.237257242202759, 2.2146575450897217, 2.197037935256958, 2.193465232849121, 2.1990575790405273, 2.193073272705078, 2.2431421279907227, 2.204183578491211, 2.235936164855957, 2.221945285797119, 2.185289144515991, 2.1666038036346436, 
2.1959757804870605, 2.171337604522705, 2.1832592487335205, 2.2154834270477295, 2.168503761291504, 2.2134923934936523],"n_epoch": 100,"testing_commands": {"criterion": {"name": "CrossEntropyLoss"},"ids": [],"image_ops": [{"resize": {"resample": "Image.ANTIALIAS","size": [500, 500]}}, {"convert": {"mode": "'RGB'"}}],"label_names": ["American Hairless Terrier imagesize:500x500", "Alaskan Malamute imagesize:500x500", "American Eskimo Dog imagesize:500x500", "Australian Shepherd imagesize:500x500", "Boston Terrier imagesize:500x500", "Boykin Spaniel imagesize:500x500", "Chesapeake Bay Retriever imagesize:500x500", "Catahoula Leopard Dog imagesize:500x500", "Toy Fox Terrier imagesize:500x500"],"limit": 200,"model": {"layers": [{"in_channels": 3,"kernel_size": 5,"name": "Conv2d","out_channels": 6}, {"inplace": true,"name": "ReLU"}, {"kernel_size": 2,"name": "MaxPool2d","stride": 2}, {"in_channels": "auto","kernel_size": 5,"name": "Conv2d","out_channels": 16}, {"inplace": true,"name": "ReLU"}, {"kernel_size": 2,"name": "MaxPool2d","stride": 2}, {"in_channels": "auto","kernel_size": 5,"name": "Conv2d","out_channels": 32}, {"inplace": true,"name": "ReLU"}, {"kernel_size": 2,"name": "MaxPool2d","stride": 2}, {"name": "Flatten","start_dim": 1}, {"in_features": 111392,"name": "Linear","out_features": 120}, {"inplace": true,"name": "ReLU"}, {"in_features": "auto","name": "Linear","out_features": 84}, {"inplace": true,"name": "ReLU"}, {"in_features": "auto","name": "Linear","out_features": "n_labels"}],"name": ""},"model_name": "american_dog_species_3","n_labels": 0,"target_transform": {"ToTensor": true},"transform": {"Normalize": {"mean": [0.5, 0.5, 0.5],"std": [0.5, 0.5, 0.5]},"ToTensor": true}},"accuracy": 0.16500000000000006,"status": "Complete","limit": 0
}
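This is exactly the kind of object an optimization-based meta-learning step can consume later on: once several such attempts are stored, their hyperparameters, loss curves, and accuracies can be cross-compared programmatically. A rough, hypothetical sketch (the attempt names are placeholders):

```python
import requests

def fetch_attempt(name: str):
    # Thin wrapper around the /find_attempt/ endpoint shown earlier.
    return requests.post("http://localhost:8000/find_attempt/", params={"name": name}).json()

names = ["american_dog_species_2", "american_dog_species_3"]  # placeholder attempt names
attempts = [fetch_attempt(n) for n in names]

# Rank finished attempts by accuracy and show the optimizer settings that produced them.
for a in sorted(attempts, key=lambda a: a["accuracy"], reverse=True):
    optimizer = a["training_commands"].get("optimizer", {})
    final_loss = a["training_losses"][-1] if a["training_losses"] else None
    print(a["name"], a["accuracy"], optimizer, "final loss:", final_loss)
```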