Converting PDF and Gutenberg Document Formats into Text: Natural Language Processing in Production

Estimates state that 70%–85% of the world's data is text (unstructured data). Most English and EU business data is formatted as byte-encoded text, MS Word, or Adobe PDF. [1]


Organizations display Adobe Portable Document Format (PDF) documents on the web. [2]


In this blog, I detail the following:


  1. Create a file path from the web file name or local file name;
  2. Change a byte-encoded Gutenberg Project file into a text corpus;
  3. Change a PDF document into a text corpus;
  4. Segment continuous text into a corpus of word text.

Converting Popular Document Formats into Text

1. Create a local file path from the web file name or local file name

The following function takes either a local file name or a remote file URL and returns a readable file object.


#in file_to_text.py
--------------------------------------------
from io import StringIO, BytesIO
from typing import Any
import urllib.request

def file_or_url(pathfilename: str) -> Any:
    """
    Return a readable file object given a local file or URL.
    Args:
        pathfilename: local path or remote URL.
    Returns:
        file object instance opened in binary mode.
    """
    try:
        fp = open(pathfilename, mode="rb")
    except OSError:
        # Not a local file; fall back to fetching the URL.
        url_text = urllib.request.urlopen(pathfilename).read()
        fp = BytesIO(url_text)
    return fp
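As a quick sanity check, the local-file branch can be exercised with a temporary file (the URL branch needs network access, so only the local path is shown; the function body is restated so the snippet stands alone):

```python
import os
import tempfile
import urllib.request
from io import BytesIO

def file_or_url(pathfilename):
    # Try the local file first; fall back to fetching the URL.
    try:
        return open(pathfilename, mode="rb")
    except OSError:
        return BytesIO(urllib.request.urlopen(pathfilename).read())

# Exercise the local-file branch with a temporary file.
with tempfile.NamedTemporaryFile(mode="wb", suffix=".txt", delete=False) as f:
    f.write(b"hello corpus")
    path = f.name

fp = file_or_url(path)
data = fp.read()
fp.close()
os.remove(path)
print(data)  # b'hello corpus'
```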

2. Change a Unicode byte-encoded file into a Python Unicode string

You will often encounter text blob downloads in 8-bit Unicode (UTF-8) format (for Romance and other Latin-script languages). You need to convert the 8-bit Unicode bytes into a Python Unicode string.


#in file_to_text.py
--------------------------------------------
def unicode_8_to_text(text: bytes) -> str:
    return text.decode("utf-8", "replace")
-------------------------------------------------------
import urllib.request
from file_to_text import unicode_8_to_text

text_l = 250
text_url = r'http://www.gutenberg.org/files/74/74-0.txt'
gutenberg_text = urllib.request.urlopen(text_url).read()
%time gutenberg_text = unicode_8_to_text(gutenberg_text)
print('{}: size: {:g} \n {} \n'.format(0, len(gutenberg_text), gutenberg_text[:text_l]))

output =>


CPU times: user 502 µs, sys: 0 ns, total: 502 µs
Wall time: 510 µs
0: size: 421927

The Project Gutenberg EBook of The Adventures of Tom Sawyer, Complete by
Mark Twain (Samuel Clemens)
This eBook is for the use of anyone anywhere at no cost and with almost
no restrictions whatsoever. You may copy it, give it away or re-use
it under the terms of the Project Gutenberg License included with this
eBook or online at www.guten

The result is that text.decode('utf-8') can convert a million characters into a Python string in about 1/1000th of a second, a rate that far exceeds our production requirements.

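That throughput claim can be checked with a self-contained sketch; a synthetic ASCII payload stands in for the Gutenberg download so no network access is assumed:

```python
import time

# Synthetic one-million-character UTF-8 payload.
payload = ("hello world " * 84000).encode("utf-8")[:1_000_000]

start = time.perf_counter()
text = payload.decode("utf-8", "replace")
elapsed = time.perf_counter() - start

print(f"decoded {len(text):,} chars in {elapsed * 1000:.2f} ms")
```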

3. Change a PDF document into a text corpus

“Changing a PDF document into a text corpus" is one of the most troublesome and common tasks I do for NLP text pre-processing.


#in file_to_text.py
--------------------------------------------
from io import StringIO
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage

def PDF_to_text(pathfilename: str) -> str:
    """
    Change PDF format to text.
    Args:
        pathfilename: local path or remote URL of the PDF.
    Returns:
        extracted text as a single string.
    """
    fp = file_or_url(pathfilename)
    rsrcmgr = PDFResourceManager()
    retstr = StringIO()
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, laparams=laparams)
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    password = ""
    maxpages = 0
    caching = True
    pagenos = set()
    for page in PDFPage.get_pages(
        fp,
        pagenos,
        maxpages=maxpages,
        password=password,
        caching=caching,
        check_extractable=True,
    ):
        interpreter.process_page(page)
    text = retstr.getvalue()
    fp.close()
    device.close()
    retstr.close()
    return text
-------------------------------------------------------
arvix_list = ['https://arxiv.org/pdf/2008.05828v1.pdf',
              'https://arxiv.org/pdf/2008.05981v1.pdf',
              'https://arxiv.org/pdf/2008.06043v1.pdf',
              'tmp/inf_finite_NN.pdf']
for n, f in enumerate(arvix_list):
    %time pdf_text = PDF_to_text(f).replace('\n', ' ')
    print('{}: size: {:g} \n {} \n'.format(n, len(pdf_text), pdf_text[:text_l]))

output =>


CPU times: user 1.89 s, sys: 8.88 ms, total: 1.9 s
Wall time: 2.53 s
0: size: 42522
On the Importance of Local Information in Transformer Based Models Madhura Pande, Aakriti Budhraja, Preksha Nema Pratyush Kumar, Mitesh M. Khapra Department of Computer Science and Engineering Robert Bosch Centre for Data Science and AI (RBC-DSAI) Indian Institute of Technology Madras, Chennai, India {mpande,abudhra,preksha,pratyush,miteshk}@
CPU times: user 1.65 s, sys: 8.04 ms, total: 1.66 s
Wall time: 2.33 s
1: size: 30586
ANAND,WANG,LOOG,VANGEMERT:BLACKMAGICINDEEPLEARNING1BlackMagicinDeepLearning:HowHumanSkillImpactsNetworkTrainingKanavAnand1anandkanav92@gmail.comZiqiWang1z.wang-8@tudelft.nlMarcoLoog12M.Loog@tudelft.nlJanvanGemert1j.c.vangemert@tudelft.nl1DelftUniversityofTechnology,Delft,TheNetherlands2UniversityofCopenhagenCopenhagen,DenmarkAbstractHowdoesauser’sp
CPU times: user 4.82 s, sys: 46.3 ms, total: 4.87 s
Wall time: 6.53 s
2: size: 57204
0 2 0 2 g u A 3 1 ] G L . s c [ 1 v 3 4 0 6 0 . 8 0 0 2 : v i X r a Offline Meta-Reinforcement Learning with Advantage Weighting Eric Mitchell1, Rafael Rafailov1, Xue Bin Peng2, Sergey Levine2, Chelsea Finn1 1 Stanford University, 2 UC Berkeley em7@stanford.edu Abstract Massive datasets have proven critical to successfully
CPU times: user 12.2 s, sys: 36.1 ms, total: 12.3 s
Wall time: 12.3 s
3: size: 89633
0 2 0 2 l u J 1 3 ] G L . s c [ 1 v 1 0 8 5 1 . 7 0 0 2 : v i X r a Finite Versus Infinite Neural Networks: an Empirical Study Jaehoon Lee Samuel S. Schoenholz∗ Jeffrey Pennington∗ Ben Adlam†∗ Lechao Xiao∗ Roman Novak∗ Jascha Sohl-Dickstein {jaehlee, schsam, jpennin, adlam, xlc, romann, jaschasd}@google.com Google Brain

On this hardware configuration, “Converting a PDF file into a Python string” requires about 150 seconds per million characters. Not fast enough for an interactive web production application.


You may want to stage formatting in the background.

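One minimal way to stage conversion in the background is a worker pool. This is a sketch, not the blog's implementation; convert here is a hypothetical stand-in for a real worker that would run the pdfminer extraction:

```python
from concurrent.futures import ThreadPoolExecutor

def convert(name: str) -> str:
    # Hypothetical stand-in for PDF_to_text: a real worker would
    # run the pdfminer extraction and return the document text.
    return f"text corpus for {name}"

documents = ["a.pdf", "b.pdf", "c.pdf"]

# Submit conversions up front; the request-handling thread stays
# free and collects each corpus only when it is needed.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {name: pool.submit(convert, name) for name in documents}
    corpora = {name: fut.result() for name, fut in futures.items()}

print(corpora["a.pdf"])  # text corpus for a.pdf
```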

4. Segment continuous text into a corpus of word text

When we read https://arxiv.org/pdf/2008.05981v1.pdf, it came back as continuous text with no separation characters. Using the wordsegment package, we separate the continuous string into words.


from wordsegment import load, clean, segment
load()  # load the wordsegment dictionary before segmenting
%time words = segment(pdf_text)
print('size: {:g} \n'.format(len(words)))
' '.join(words)[:text_l*4]

output =>


CPU times: user 1min 43s, sys: 1.31 s, total: 1min 44s
Wall time: 1min 44s
size: 5005
'an and wang loog van gemert blackmagic in deep learning 1 blackmagic in deep learning how human skill impacts network training kanavanand1anandkanav92g mailcom ziqiwang1zwang8tudelftnl marco loog12mloogtudelftnl jan van gemert 1jcvangemerttudelftnl1 delft university of technology delft the netherlands 2 university of copenhagen copenhagen denmark abstract how does a users prior experience with deep learning impact accuracy we present an initial study based on 31 participants with different levels of experience their task is to perform hyper parameter optimization for a given deep learning architecture the results show a strong positive correlation between the participants experience and then al performance they additionally indicate that an experienced participant nds better solutions using fewer resources on average the data suggests furthermore that participants with no prior experience follow random strategies in their pursuit of optimal hyperparameters our study investigates the subjective human factor in comparisons of state of the art results and scientic reproducibility in deep learning 1 introduction the popularity of deep learning in various elds such as image recognition 919speech1130 bioinformatics 2124questionanswering3 etc stems from the seemingly favorable tradeoff between the recognition accuracy and their optimization burden lecunetal20 attribute their success t'

You will notice that wordsegment accomplishes a fairly accurate separation into words. There are some errors, or unwanted words, that NLP text pre-processing clears away.


The Apache-licensed wordsegment package is slow. It is barely adequate in production for small documents of fewer than a thousand words. Can we find some faster way to segment?


4b. Segment continuous text into a corpus of word text

There seems to be a faster method to "Segment continuous text into Corpus of word text."


As discussed in the following blog:


SymSpell is 100x–1000x faster. Wow!


Note: ed: 8/24/2020. Wolf Garbe deserves credit for pointing out:


The benchmark results (100x–1000x faster) given in the SymSpell blog post refer solely to spelling correction, not to word segmentation. In that post, SymSpell was compared to other spelling correction algorithms, not to word segmentation algorithms. — Wolf Garbe 8/23/2020


and

Also, there is an easier way to call a C# library from Python: https://stackoverflow.com/questions/7367976/calling-a-c-sharp-library-from-python — Wolf Garbe 8/23/2020


Note: ed: 8/24/2020. I am going to try Garbe's C# implementation. If I do not get the same results (and probably even if I do), I will try a Cython port and see if I can fit it into spaCy as a pipeline element. I will let you know my results.


However, it is implemented in C#, and I am not going down the infinite ratholes of:


  • Converting all my NLP into C#. Not a viable option.

  • Calling C# from Python. I talked to two engineering managers of Python groups. They have Python-C# capability, but it involves:

Note:


  1. Translating to VB-vanilla;
  2. Manual intervention, and the translation must pass tests for reproducibility;
  3. Translating from VB-vanilla to C;
  4. Manual intervention, and the translation must pass tests for reproducibility.

Instead, we work with a port to Python. Here is a version:


import pkg_resources
from symspellpy import SymSpell

def segment_into_words(input_term):
    # maximum edit distance per dictionary precalculation
    max_edit_distance_dictionary = 0
    prefix_length = 7
    # create object
    sym_spell = SymSpell(max_edit_distance_dictionary, prefix_length)
    # load dictionary
    dictionary_path = pkg_resources.resource_filename(
        "symspellpy", "frequency_dictionary_en_82_765.txt")
    bigram_path = pkg_resources.resource_filename(
        "symspellpy", "frequency_bigramdictionary_en_243_342.txt")
    # term_index is the column of the term and count_index is the
    # column of the term frequency
    if not sym_spell.load_dictionary(dictionary_path, term_index=0,
                                     count_index=1):
        print("Dictionary file not found")
        return
    if not sym_spell.load_bigram_dictionary(bigram_path, term_index=0,
                                            count_index=2):
        print("Bigram dictionary file not found")
        return
    result = sym_spell.word_segmentation(input_term)
    return result.corrected_string
-------------------------------------------------------
%time long_s = segment_into_words(pdf_text)
print('size: {:g} {}'.format(len(long_s), long_s[:text_l*4]))

output =>


CPU times: user 20.4 s, sys: 59.9 ms, total: 20.4 s
Wall time: 20.4 s
size: 36585 ANAND,WANG,LOOG,VANGEMER T:BLACKMAGICINDEEPLEARNING1B lack MagicinDeepL earning :HowHu man S kill Imp acts Net work T raining Ka nav An and 1 an and kana v92@g mail . com ZiqiWang1z. wang -8@tu delft .nlM arc oLoog12M.Loog@tu delft .nlJ an van Gemert1j.c. vang emert@tu delft .nl1D elf tUniversityofTechn ology ,D elf t,TheN ether lands 2UniversityofC open hagen C open hagen ,Den mark Abs tract How does a user ’s prior experience with deep learning impact accuracy ?We present an initial study based on 31 participants with different levels of experience .T heir task is to perform hyper parameter optimization for a given deep learning architecture .T here -s ult s show a strong positive correlation between the participant ’s experience and the fin al performance .T hey additionally indicate that an experienced participant finds better sol u-t ions using fewer resources on average .T he data suggests furthermore that participants with no prior experience follow random strategies in their pursuit of optimal hyper pa-ra meters .Our study investigates the subjective human factor in comparisons of state of the art results and sci entific reproducibility in deep learning .1Intro duct ion T he popularity of deep learning in various fi eld s such as image recognition [9,19], speech [11,30], bio informatics [21,24], question answering [3] etc . stems from the seemingly fav or able trade - off b

SymSpellpy, implemented in Python, is about 5x faster. We are not seeing 100x–1000x faster.


I guess that the SymSpell-C# benchmark was comparing against different segmentation algorithms implemented in Python.


Perhaps the speedup comes from C#, a compiled, statically typed language. Since C# and C are about the same computing speed, we could expect a C# implementation to be 100x–1000x faster than a Python implementation.


Note: There is a spaCy pipeline implementation, spacy_symspell, which directly calls SymSpellpy. I recommend you don't use spacy_symspell. spaCy generates tokens as the first step of the pipeline, and those tokens are immutable. spacy_symspell generates new text by segmenting continuous text; it cannot create new tokens, because spaCy has already generated them. A spaCy pipeline works on a token sequence, not a stream of text. One would have to spin off a changed version of spaCy. Why bother? Instead, segment continuous text into a corpus of word text. Then correct embedded whitespace within words and hyphenated words in the text. Do any other raw cleaning you want to do. Then feed the raw text to spaCy.

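The cleanup step suggested above (fix hyphenated words and stray whitespace before handing the raw text to spaCy) can be sketched with stdlib regexes; the patterns here are illustrative assumptions, not the exact cleaning rules:

```python
import re

def clean_raw_text(text: str) -> str:
    # Re-join words hyphenated across line breaks, e.g. "Learn- ing".
    text = re.sub(r"(\w+)-\s+(\w+)", r"\1\2", text)
    # Collapse whitespace runs left over from PDF extraction.
    text = re.sub(r"\s+", " ", text)
    return text.strip()

raw = "Black Magic in Deep Learn- ing:  How Human Skill\nImpacts Net- work Training"
print(clean_raw_text(raw))
# Black Magic in Deep Learning: How Human Skill Impacts Network Training
```

The cleaned string, not the segmented token list, is what gets fed to spaCy, which then does its own tokenization.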

I show spacy_symspell below. Again, my advice is not to use it.


import spacy
from spacy_symspell import SpellingCorrector

def segment_into_words(input_term):
    nlp = spacy.load("en_core_web_lg", disable=["tagger", "parser"])
    corrector = SpellingCorrector()
    nlp.add_pipe(corrector)

Conclusion

In future blogs, I will detail many common and uncommon fast text pre-processing methods. Also, I will show the expected speedup from moving SymSpellpy to Cython.


There will be many more formats and APIs you need to support in the world of “Changing X format into a text corpus.”


I detailed two of the more common document formats, PDF and Project Gutenberg. Also, I gave two NLP utility functions: segment_into_words and file_or_url.


I hope you learned something and can use some of the code in this blog.


If you have some format conversions, or better yet a package of them, let me know.


Translated from: https://towardsdatascience.com/natural-language-processing-in-production-converting-pdf-and-gutenberg-document-formats-into-text-9e7cd3046b33
