SQL Use Guide for Non-Programmers

Today, the word of the moment is DATA. This little combination of four letters is transforming how companies and their employees work, yet most people don't really know how data behaves or how to access it. Many also assume it is a job only for the tech person on the IT team or for someone who knows how to code.

So this article explains a part of this complex world in an easy way, and we shall begin at the beginning: the data.

Basically, we can split data into two groups: Structured and Unstructured.

[Image: structured vs. unstructured data]
Source: https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/

As the image shows, Unstructured data is far more abundant than Structured data. Despite that, Structured data is what we use most in day-to-day work, for reasons such as:

  • Can be displayed in rows and columns
  • Requires less storage
  • Easier to manage and manipulate

I'm not saying that the unstructured 80% is unimportant, quite the opposite. It is just more complex to deal with, so it is not the focus of this text.

With that explained, our objective is to learn a little about how we can access and manipulate structured data.

Structured Data Concepts

Let's use the image below as an example:

[Image: example relational tables]
Source: https://neo4j.com/blog/rdbms-graphs-why-relational-databases-arent-enough/

Structured data is organized in Tables. In this image we have three: Persons, Dept_Members, and Department.

Each Table is organized in Columns and Rows.

Each column has a data type. Depending on the database you are using, the names and the number of available data types can change, but broadly speaking we have Strings, Numbers, Dates, and Timestamps.

  • Strings: Any text.
  • Numbers: Any numeric value.
  • Dates: Calendar dates only, without hours, minutes, and seconds.
  • Timestamps: Dates with hours, minutes, and seconds.

As I said, some databases use different names or have more specific types, for example:

In Oracle, a string column can be declared as VARCHAR2 or CHAR. The difference is how they store text: CHAR is fixed-length (values are padded with spaces up to the declared length, which defaults to one character), while VARCHAR2 is variable-length up to a declared maximum. Google BigQuery, by contrast, has a single STRING data type for all text data.

Now that we have covered the columns, what is left is the rows. Each row is a record of the Table, and one very important question is: "How can we differentiate one record from another? What separates them?"

The answer is the Primary Key.

The column, or combination of columns, whose values make each record in a table unique is the primary key. Some tables have a dedicated ID column (often called an index) that serves as the primary key too, but it does not show you what makes a record unique in business terms.

[Image: Index/Id example]
Source: http://gavo.mpa-garching.mpg.de/Millennium/Help/databaseconcepts
[Image]
Source: https://docs.microsoft.com/pt-br/sql/relational-databases/tables/primary-and-foreign-key-constraints?view=sql-server-ver15
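To make the primary key idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table store_sales and its columns are hypothetical names chosen for illustration, not taken from the article; the point is only that the database refuses a second record with the same key combination.

```python
import sqlite3

# Hypothetical table for illustration; the names are not from the article.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE store_sales (
        store_code TEXT,
        sale_date  TEXT,
        quantity   INTEGER,
        PRIMARY KEY (store_code, sale_date)  -- the combination must be unique
    )
    """
)
conn.execute("INSERT INTO store_sales VALUES ('0001', '2020-01-01', 10)")
conn.execute("INSERT INTO store_sales VALUES ('0001', '2020-01-02', 7)")

# A second record with the same (store_code, sale_date) violates the key.
try:
    conn.execute("INSERT INTO store_sales VALUES ('0001', '2020-01-01', 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM store_sales").fetchone()[0]
print(duplicate_rejected, row_count)
```

Here the same store can appear on different dates, but the pair of store and date can appear only once, which is exactly what makes each record distinguishable.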

Some databases describe the primary key in the table documentation, but if you don't have this information, don't be afraid: explore your dataset!

And here comes the main goal of the article: How do I explore it?

Structured Query Language (SQL)

Structured Query Language is the standard declarative search language for relational databases.

The text above is the dictionary explanation of what SQL is. In plainer terms, it is the language that lets us get data from one table, or from a combination of tables, and describe how their data is related.

Boiled down to the essentials, a standard SQL query has "only" (with huge quotes here) three elements:

  • SELECT: where you define which columns you want to pick from your tables.
  • FROM: where you define which tables you are going to use and how they relate.
  • WHERE: where you define which rows you want and do not want to see.

This is what a query looks like.

[Image: example query. Source: the writer]

Here is what we can read from it:

We want to get the data in columns A, B, C, and D of TABLE_1, which lives in SCHEMA_1 (a schema is like a folder of tables), and we want only the rows with code '0001' in column A.
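The original query image is not available, so the following is a reconstruction from the description above, not the author's exact query. It is a minimal sketch run through Python's built-in sqlite3 module; the sample rows are made up, and attaching an in-memory database under the name SCHEMA_1 is only a SQLite trick so the query can keep the SCHEMA_1.TABLE_1 form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no schemas in the Oracle/BigQuery sense; attaching a second
# in-memory database named SCHEMA_1 lets us write SCHEMA_1.TABLE_1.
conn.execute("ATTACH DATABASE ':memory:' AS SCHEMA_1")
conn.execute("CREATE TABLE SCHEMA_1.TABLE_1 (A TEXT, B TEXT, C INTEGER, D TEXT)")
conn.executemany(
    "INSERT INTO SCHEMA_1.TABLE_1 VALUES (?, ?, ?, ?)",
    [  # made-up sample data
        ("0001", "2020-01-01", 10, "x"),
        ("0002", "2020-01-01", 5, "y"),
        ("0001", "2020-01-02", 7, "z"),
    ],
)

query = """
SELECT A, B, C, D          -- what we want to pick
FROM SCHEMA_1.TABLE_1      -- where it lives
WHERE A = '0001'           -- which rows we want to see
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the two rows whose column A holds '0001' come back; the row for '0002' is filtered out by the WHERE clause.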

It was easy, wasn't it? Let's look at a slightly more complex example.

In column C we have a number (it could be sales quantity, stock projection, purchase order quantity, etc.) and we want to sum its values by column A (which could be store or product codes) and column B (maybe a date).

We also want to order the result first by column A and then by column B.

[Image: example query with aggregation. Source: the writer]

Now, when we want to aggregate a value based on other attributes, we have to say: "Look, I'm aggregating this guy here (C) this way (SUM) and by these two dudes (GROUP BY A and B)."
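As the query image is missing, here is a hedged reconstruction of the aggregation described above, again as a small sketch with Python's built-in sqlite3 module. The sample rows and the alias TOTAL_C are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE_1 (A TEXT, B TEXT, C INTEGER)")
conn.executemany(
    "INSERT INTO TABLE_1 VALUES (?, ?, ?)",
    [  # made-up sample data: A is a code, B a date, C a quantity
        ("0002", "2020-01-01", 5),
        ("0001", "2020-01-02", 7),
        ("0001", "2020-01-01", 10),
        ("0001", "2020-01-01", 3),
    ],
)

query = """
SELECT A, B, SUM(C) AS TOTAL_C   -- aggregate C with SUM
FROM TABLE_1
GROUP BY A, B                    -- one result row per (A, B) pair
ORDER BY A, B                    -- sort by A first, then by B
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

The two rows for code '0001' on '2020-01-01' collapse into a single row with a total of 13, while the other combinations keep their own totals.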

In the end, this isn't too different from the last one, right?

Relationship Between Tables

So far, every example queried data from only one table at a time. But what do we do if we want to combine data from two or more tables?

The answer is simple: we say how the tables relate to each other by specifying which columns hold equivalent data.

Our example now is a sales table. It only has center and product codes and the product stock quantity, but I also want the product and center names, and both pieces of information live in other tables.

I'll also say that I only want to see stock quantities higher than 100 units.

[Image: example query with joins]

Let’s focus on the differences between the last example and this one. What is new here?

Tables can have nicknames (aliases). This is commonly used when a query involves more than one table.

With an alias you don't have to write the whole table location each time you reference it.

It is also important because joined tables can have columns with the same name, so the query must state whether we are using PLNT_CD from DIM_PLNT or from FT_SLS. Otherwise, the database doesn't know which table to take the column from.

Join is the way you combine data from tables. Always think of two tables at a time: one is called the Left and the other the Right.

Left has a set of records called L, Right has another set called R, and some records exist in both.

Joins come in several types. The one shown in the example is the Left Join: we keep every record of the Left table and bring in the values from Right that have a corresponding value in Left. Left records with no match in Right are still returned, with the Right columns left empty (NULL).

When we compare columns to create the join between tables, we have to remember that, in many databases, the compared columns must be of the same data type for the relationship to work.

In this example, the join had to convert the PLNT_CD field of the SLS table to STRING; otherwise the join could not be completed.
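The article's query image is not available, so the following is only a sketch of the ideas above (aliases, a Left Join, and a type conversion in the join condition), run through Python's built-in sqlite3 module. Only DIM_PLNT, FT_SLS, and PLNT_CD come from the text; the other columns (PLNT_NM, PROD_CD, STK_QTY) and all data are assumptions. SQLite is lenient about comparing numbers with text, so the CAST is written explicitly to mirror what a stricter database would require, and SQLite spells the text type TEXT rather than STRING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Fact table: the plant (center) code is stored as a number here.
conn.execute("CREATE TABLE FT_SLS (PLNT_CD INTEGER, PROD_CD TEXT, STK_QTY INTEGER)")
# Dimension table: the same code is stored as text, plus the plant name.
conn.execute("CREATE TABLE DIM_PLNT (PLNT_CD TEXT, PLNT_NM TEXT)")
conn.executemany(
    "INSERT INTO FT_SLS VALUES (?, ?, ?)",
    [(100, "P1", 150), (200, "P1", 120), (300, "P2", 500)],  # made-up data
)
conn.executemany(
    "INSERT INTO DIM_PLNT VALUES (?, ?)",
    [("100", "North Center"), ("200", "South Center")],  # no row for plant 300
)

query = """
SELECT s.PLNT_CD, p.PLNT_NM, s.PROD_CD, s.STK_QTY
FROM FT_SLS AS s                          -- Left table, alias s
LEFT JOIN DIM_PLNT AS p                   -- Right table, alias p
    ON CAST(s.PLNT_CD AS TEXT) = p.PLNT_CD
ORDER BY s.PLNT_CD
"""
joined = conn.execute(query).fetchall()
for row in joined:
    print(row)
```

Plant 300 has no entry in DIM_PLNT, yet its sales row is still returned, with the name coming back as None (NULL). That is the defining behavior of a Left Join.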

Inside the WHERE clause we have a new construct called BETWEEN, used to filter a range of data. It is inclusive: the value has to be greater than or equal to the first parameter and less than or equal to the second one.
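A small sketch of BETWEEN, using Python's built-in sqlite3 module. The table layout and the values are assumptions for illustration; the example is chosen so that both boundary values appear in the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FT_SLS (PROD_CD TEXT, STK_QTY INTEGER)")
conn.executemany(
    "INSERT INTO FT_SLS VALUES (?, ?)",
    [("P1", 99), ("P2", 100), ("P3", 150), ("P4", 200), ("P5", 201)],  # made-up data
)

query = """
SELECT STK_QTY
FROM FT_SLS
WHERE STK_QTY BETWEEN 100 AND 200   -- both ends of the range are included
ORDER BY STK_QTY
"""
in_range = [row[0] for row in conn.execute(query)]
print(in_range)
```

The values 100 and 200 are both returned, while 99 and 201 are not, which shows that the two limits themselves belong to the range.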

Lastly, when a query applies SUM(), AVG(), or any other aggregate, we may want to filter on the aggregated results. HAVING lets us do that by filtering the grouped result of the query before it is shown.
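To illustrate HAVING, here is a minimal sketch with Python's built-in sqlite3 module. The table, the alias TOTAL_QTY, and the data are assumptions; the threshold of 100 units echoes the stock filter mentioned earlier in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FT_SLS (PROD_CD TEXT, STK_QTY INTEGER)")
conn.executemany(
    "INSERT INTO FT_SLS VALUES (?, ?)",
    [("P1", 60), ("P1", 70), ("P2", 40), ("P2", 50), ("P3", 300)],  # made-up data
)

query = """
SELECT PROD_CD, SUM(STK_QTY) AS TOTAL_QTY
FROM FT_SLS
GROUP BY PROD_CD
HAVING SUM(STK_QTY) > 100    -- filter applied to the summed value per product
ORDER BY PROD_CD
"""
big_totals = conn.execute(query).fetchall()
print(big_totals)
```

Note the difference from WHERE: no single row of P1 exceeds 100, so a WHERE filter on STK_QTY would drop P1 entirely, but its total of 130 passes the HAVING filter. P2, with a total of 90, is excluded.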

We are getting closer to the end of this article, but before we finish it…

Tips

  • If you don't know what you want to see in a table, you can use * to return all of its columns in the query.
SELECT
TABLE.*
FROM
SCHEMA.TABLE
  • Basically we have two types of tables, dimensional (DIM) and fact (FT).

Dimensional tables store the attributes of an entity such as a store or a product: its name, address, shape, size, etc. Think of them as holding the current, up-to-date values.

Fact tables store transactional data such as purchase orders or sales tickets, so they capture the data as it was at the moment of each event.

  • Types of joins

There are several types of joins, and I could write a whole article on that theme alone, but I found this summary that explains a few of them.

[Image: summary of SQL join types]
Source: https://stackoverflow.com/questions/36882478/how-to-do-sql-joins-in-lambda/36883214
  • IN and NOT IN

Sometimes you need data not for a single product or location, but for a list of them.

In these cases, you can use the IN or NOT IN operator in the WHERE clause to pass a list of desired values, instead of an endless repetition of ANDs and ORs that test one value at a time.
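A short sketch of both operators, using Python's built-in sqlite3 module. The table and the product codes are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FT_SLS (PROD_CD TEXT, STK_QTY INTEGER)")
conn.executemany(
    "INSERT INTO FT_SLS VALUES (?, ?)",
    [("P1", 10), ("P2", 20), ("P3", 30), ("P4", 40)],  # made-up data
)

# IN keeps the rows whose value appears in the list.
# It is equivalent to: PROD_CD = 'P1' OR PROD_CD = 'P3'
in_query = """
SELECT PROD_CD FROM FT_SLS
WHERE PROD_CD IN ('P1', 'P3')
ORDER BY PROD_CD
"""
# NOT IN keeps the rows whose value does not appear in the list.
not_in_query = """
SELECT PROD_CD FROM FT_SLS
WHERE PROD_CD NOT IN ('P1', 'P3')
ORDER BY PROD_CD
"""
selected = [row[0] for row in conn.execute(in_query)]
excluded = [row[0] for row in conn.execute(not_in_query)]
print(selected, excluded)
```

With a list of two values the saving is small, but with dozens of codes the IN form stays one readable line where the OR chain would not.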

And this is it!

Well, with this I think you can now use SQL to access your data with a little more ease!

I hope this article has helped you!

Translated from: https://medium.com/swlh/sql-use-guide-for-non-programmers-5997af000c5f
