How do I use the filesystem cache in Java or Python?
A recent blog post on the Elasticsearch website discusses the features of their new 1.4 beta release.
I am curious about how they make use of the filesystem cache:
Recent releases have added support for doc values. Essentially, doc values provide the same function as in-memory fielddata, but they are written to disk at index time. The benefit that they provide is that they consume very little heap space. Doc values are read from disk, instead of from memory. While disk access is slow, doc values benefit from the kernel’s filesystem cache. The filesystem cache, unlike the JVM heap, is not constrained by the 32GB limit. By shifting fielddata from the heap to the filesystem cache, you can use smaller heaps which means faster garbage collections and thus more stable nodes.
Before this release, doc values were significantly slower than in-memory fielddata. The changes in this release have improved the performance significantly, making them almost as fast as in-memory fielddata.
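Does this mean we can manipulate the behavior of the filesystem cache instead of passively waiting for the OS to do its work? If so, how can we make use of the filesystem cache in normal application development? Say I am writing a Python or Java program: how would I do this?
Answer: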
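The filesystem cache is an implementation detail of the OS's inner workings and is transparent to the end user. It is not something that needs to be adjusted or changed. Lucene already makes use of the filesystem cache when it manages its index segments. Every time something is indexed into Lucene (via Elasticsearch), the documents are written to segments. These segments are first written to the filesystem cache and then, after some time (for example, when the translog, which keeps track of the documents being indexed, is full), the contents of the cache are written to an actual file. While the documents to be indexed are still in the filesystem cache, they can already be accessed.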
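The write path described above can be observed from ordinary application code. The following is only a minimal Python sketch, not how Lucene or Elasticsearch implement it: the function name `write_then_read` and the file name `segment.bin` are made up for illustration, and it assumes a typical OS where `write()` hands data to the kernel's cache and `os.fsync()` asks for it to be persisted.

```python
import os
import tempfile


def write_then_read(path, payload):
    """Write payload, read it back before fsync, then force it to storage."""
    with open(path, "wb") as writer:
        writer.write(payload)
        # flush() moves the bytes from Python's userspace buffer to the kernel.
        # On a typical OS they now sit in the filesystem cache and may not
        # have reached the physical disk yet.
        writer.flush()

        # The data is already visible to readers, served from the cache.
        with open(path, "rb") as reader:
            seen_before_sync = reader.read()

        # fsync() asks the kernel to write the cached pages to storage.
        os.fsync(writer.fileno())
    return seen_before_sync


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        segment_path = os.path.join(tmp_dir, "segment.bin")  # hypothetical name
        data = write_then_read(segment_path, b"indexed document")
        print(data)
```

The application never addresses the cache directly here. It only decides when to hand data to the kernel (`flush`) and when to insist on durability (`fsync`); the caching in between is the OS's job.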
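The improvement in the doc values implementation is what the post means by being able to use the filesystem cache: doc values are read from disk, placed in the cache, and accessed from there, instead of taking up heap space.
How this filesystem cache is accessed is described in this excellent blog post: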
In our previous approaches, we were relying on using a syscall to copy the data between the file system cache and our local Java heap. How about directly accessing the file system cache? This is what mmap does!
Basically mmap does the same like handling the Lucene index as a swap file. The mmap() syscall tells the O/S kernel to virtually map our whole index files into the previously described virtual address space, and make them look like RAM available to our Lucene process. We can then access our index file on disk just like it would be a large byte[] array (in Java this is encapsulated by a ByteBuffer interface to make it safe for use by Java code). If we access this virtual address space from the Lucene code we don’t need to do any syscalls, the processor’s MMU and TLB handles all the mapping for us. If the data is only on disk, the MMU will cause an interrupt and the O/S kernel will load the data into file system cache. If it is already in cache, MMU/TLB map it directly to the physical memory in file system cache.
As for the actual way of using mmap in a Java program, I think this is the class and method to do so.
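For the Python side of the question, the same idea can be tried with the standard library `mmap` module. This is a minimal sketch under stated assumptions: the helper name `read_slice_via_mmap` and the file name `index.bin` are hypothetical, and the file must be non-empty, because mapping an empty file raises an error.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, start, length):
    """Return `length` bytes starting at `start`, read through a memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing copies bytes out of the mapped pages. The kernel loads
            # those pages into the filesystem cache on first access.
            return mapped[start:start + length]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        index_path = os.path.join(tmp_dir, "index.bin")  # hypothetical name
        with open(index_path, "wb") as out:
            out.write(b"0123456789" * 1000)
        print(read_slice_via_mmap(index_path, 10, 5))
```

The program does not manage the cache itself. It only maps the file and lets the OS decide which pages stay in memory, which matches the point made in the answer that the cache is transparent to the application.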