Shrinking Big Data with Bit Folding
Sometimes your dataset is simply too large, and you need a way to shrink it to a reasonable size. I am running into this right now as I work on different machine learning techniques for checkers. I could work for over 18 years and buy over 10 petabytes of data storage to solve the game, but I would rather give up some of the quality of the solution, get a program that plays checkers well, and use fewer resources in the process.
One technique you can use is called bit folding. It is similar to hashing in that the function is one-way and can produce the same result for multiple inputs. The latter phenomenon is known as a collision. Collisions have a bad connotation in hashing, but here we need them: mapping many inputs to the same output is exactly how the data shrinks.
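To make the role of collisions concrete, here is a minimal sketch that folds every 4-bit value down to 2 bits and counts how many inputs land on each output. The helper name xor_fold_halves and the choice of XOR over the two halves are assumptions made for illustration; they are not taken from the author's script.

```python
from collections import Counter


def xor_fold_halves(value: int, width: int) -> int:
    """Fold a `width`-bit value to width // 2 bits by XOR-ing its two halves.

    Hypothetical helper for illustration only.
    """
    half = width // 2
    mask = (1 << half) - 1
    return (value >> half) ^ (value & mask)


# Fold all sixteen 4-bit inputs down to 2 bits and count inputs per output.
buckets = Counter(xor_fold_halves(v, 4) for v in range(16))
print(dict(buckets))  # {0: 4, 1: 4, 2: 4, 3: 4}
```

Sixteen inputs share four outputs, so each output is a collision of four inputs. That sharing is the shrinkage.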
When folding bits, you start with your data in binary form. Then you fold the bits into one another: each fold combines two bits into one, so you lose half of the information those two bits held. How do you combine the bits? Starting with bits A and B, you may choose one of the seven operations shown in Table 1 to condense the information.
At first glance, you may notice that several combinations are missing. First, the outputs of all zeros and all ones are not present. Such an output would erase the information rather than condense it. We do not want to remove all of the information in the input bits; we just want to shrink it. Second, we omit the inverse of each listed output, because an operation and its inverse convey the same information.
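The count of seven can be checked by enumeration: two input bits admit 16 possible truth tables, removing the two constant outputs leaves 14, and pairing each table with its inverse leaves 7. The sketch below performs that count. Table 1 is not reproduced here, so the representative kept from each pair (the one that outputs 0 for A = 0, B = 0) and the labels in NAMES are assumptions; the original table may list the inverse of any of them.

```python
from itertools import product

# Input rows in the order (A, B) = (0,0), (0,1), (1,0), (1,1).
INPUTS = list(product((0, 1), repeat=2))

# Assumed labels for the representative truth tables.
NAMES = {
    (0, 0, 0, 1): "A AND B",
    (0, 0, 1, 0): "A AND NOT B",
    (0, 0, 1, 1): "A",
    (0, 1, 0, 0): "NOT A AND B",
    (0, 1, 0, 1): "B",
    (0, 1, 1, 0): "A XOR B",
    (0, 1, 1, 1): "A OR B",
}


def complement(table):
    """Return the inverse truth table (every output bit flipped)."""
    return tuple(1 - bit for bit in table)


classes = []
seen = set()
for table in product((0, 1), repeat=4):
    if len(set(table)) == 1:  # all zeros or all ones: no information kept
        continue
    if table in seen:         # inverse of a table already listed
        continue
    seen.update({table, complement(table)})
    classes.append(table)

for table in classes:
    print(NAMES[table], dict(zip(INPUTS, table)))
```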
I wrote a short Python script that you can use to fold your bits. The first function is gen_param(size). Given the size of the input data that you want to fold, it generates random parameters for bit folding and returns two lists. The first list maps which bits fold into which other bits, and the second gives the operation to use for each fold. The parameters are random because the data is already too large to analyze directly, so random fold parameters are enough to get the data shrunk. Once you have shrunk the data and tested it, you can compare randomly generated parameter sets against one another and keep the one that performs better. The second function, fold(value, new_size, mapping, ops), takes your parameters and returns your value folded down to new_size bits.
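The script itself is not included in this text, so the following is only a sketch of how two functions with those signatures might behave, not the author's implementation. The layout of mapping (each bit folds into a randomly chosen lower bit), the operation names in OPS, the optional rng argument, and the top-down fold order are all assumptions.

```python
import random

# Seven two-input operations, one per inverse pair (assumed names).
# a is the target bit, b is the bit being folded into it.
OPS = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "keep_a": lambda a, b: a,
    "keep_b": lambda a, b: b,
    "a_and_not_b": lambda a, b: a & (1 - b),
    "not_a_and_b": lambda a, b: (1 - a) & b,
}


def gen_param(size, rng=random):
    """Return random (mapping, ops) for folding a `size`-bit value.

    mapping[i] is a lower bit index that bit i folds into, and ops[i]
    names the operation for that fold. Bit 0 is never folded away, so
    its entries are unused placeholders.
    """
    mapping = [0] + [rng.randrange(i) for i in range(1, size)]
    ops = [rng.choice(list(OPS)) for _ in range(size)]
    return mapping, ops


def fold(value, new_size, mapping, ops):
    """Fold the low len(mapping) bits of `value` down to `new_size` bits."""
    size = len(mapping)
    if not 1 <= new_size <= size:
        raise ValueError("new_size must be between 1 and len(mapping)")
    bits = [(value >> i) & 1 for i in range(size)]  # least significant first
    # Fold from the top bit downward; every target index is below i,
    # so the target has not been folded away yet.
    for i in range(size - 1, new_size - 1, -1):
        target = mapping[i]
        bits[target] = OPS[ops[i]](bits[target], bits[i])
    return sum(bit << i for i, bit in enumerate(bits[:new_size]))


# Usage: fold a 16-bit value down to 4 bits with seeded random parameters.
mapping, ops = gen_param(16, random.Random(0))
folded = fold(0xBEEF, 4, mapping, ops)
print(folded)  # an integer in range(16)
```

Because every bit folds into a lower index, one parameter set works for any new_size from 1 up to size, which keeps comparisons between target sizes simple.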
If your dataset is too large and you are looking for ways to shrink it, try out my program. I run comparison tests between two sets of parameters to find the better one and improve my algorithms. For datasets that are too large, bit folding gives you speed and manageable data sizes in exchange for precision.
Translated from: https://medium.com/swlh/shrinking-big-data-with-bit-folding-4ea0aa6a055d