java - Apache POI Streaming (SXSSF) for Reading
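I need to read large Excel files and import their data into my application.
Because POI needs a lot of heap to work and frequently throws OutOfMemory errors, I looked around and found that there is a Streaming API for processing Excel data serially (rather than loading the whole file into memory).
I created an xlsx workbook with a single sheet, typed a few values into its cells, and wrote the following code to try to read it: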
public static void main(String[] args) throws Throwable {
    // keep 100 rows in memory, exceeding rows will be flushed to disk
    SXSSFWorkbook wb = new SXSSFWorkbook(new XSSFWorkbook(new FileInputStream("C:\\test\\tst.xlsx")));
    SXSSFSheet sheet = (SXSSFSheet) wb.getSheetAt(0);
    Row row = sheet.getRow(0);
    // row is always null
    while (row.iterator().hasNext()) { // -> NullPointerException
        System.out.println(row.getCell(0).getStringCellValue());
    }
}
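However, although the sheet itself is retrieved correctly, its rows always come back null.
I have searched the internet and found several examples of the Streaming API, but none of them read an existing file; they are all about generating Excel files.
Is it actually possible to read data from an existing .xlsx file as a stream?
Solution:
After digging a bit more, I found this library: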
If you’ve used Apache POI in the past to read in Excel files, you probably noticed that it’s not very memory efficient. Reading in an entire workbook will cause a severe memory usage spike, which can wreak havoc on a server.
There are plenty of good reasons for why Apache has to read in the whole workbook, but most of them have to do with the fact that the library allows you to read and write with random addresses. If (and only if) you just want to read the contents of an Excel file in a fast and memory efficient way, you probably don’t need this ability. Unfortunately, the only thing in the POI library for reading a streaming workbook requires your code to use a SAX-like parser. All of the friendly classes like Row and Cell are missing from that API.
This library serves as a wrapper around that streaming API while preserving the syntax of the standard POI API. Read on to see if it’s right for you.
InputStream is = new FileInputStream(new File("/path/to/workbook.xlsx"));
StreamingReader reader = StreamingReader.builder()
        .rowCacheSize(100)     // number of rows to keep in memory (defaults to 10)
        .bufferSize(4096)      // buffer size to use when reading InputStream to file (defaults to 1024)
        .sheetIndex(0)         // index of sheet to use (defaults to 0)
        .sheetName("sheet1")   // name of sheet to use (overrides sheetIndex)
        .read(is);             // InputStream or File for XLSX file (required)
There is also the SAX Event API, which reads the document and parses its contents through events.
If memory footprint is an issue, then for XSSF, you can get at the underlying XML data, and process it yourself. This is intended for intermediate developers who are willing to learn a little bit of low level structure of .xlsx files, and who are happy processing XML in Java. It's relatively simple to use, but requires a basic understanding of the file structure. The advantage provided is that you can read an XLSX file with a relatively small memory footprint.
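To make the "process the XML yourself" idea concrete, here is a minimal sketch that uses only JDK classes (java.util.zip and the built-in SAX parser), not POI's own event API. The class name XlsxSaxSketch and its methods are hypothetical, and the part names xl/sharedStrings.xml and xl/worksheets/sheet1.xml are assumed defaults; a robust reader would resolve them from xl/workbook.xml. It ignores cell styles, so dates and formatted numbers come out as raw values.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class (not part of Apache POI); uses only JDK classes.
class XlsxSaxSketch {

    // Parses an XML stream with a namespace-aware SAX parser.
    static void parse(InputStream in, DefaultHandler handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(in, handler);
    }

    // Loads the shared string table: one entry per <si>, joining all of its <t> runs.
    // Simplification: phonetic runs, if present, are joined in as well.
    static List<String> readSharedStrings(InputStream in) throws Exception {
        List<String> strings = new ArrayList<>();
        parse(in, new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inText;
            @Override public void startElement(String uri, String name, String qName, Attributes a) {
                if (name.equals("si")) text.setLength(0);
                inText = name.equals("t");
            }
            @Override public void characters(char[] ch, int start, int len) {
                if (inText) text.append(ch, start, len);
            }
            @Override public void endElement(String uri, String name, String qName) {
                inText = false;
                if (name.equals("si")) strings.add(text.toString());
            }
        });
        return strings;
    }

    // Streams one worksheet part and reports (cell reference, value) pairs as they are parsed,
    // so no more than one cell of the sheet is held in memory at a time.
    static void readSheet(InputStream in, List<String> shared, BiConsumer<String, String> onCell)
            throws Exception {
        parse(in, new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String ref, type;
            private boolean inValue;
            @Override public void startElement(String uri, String name, String qName, Attributes a) {
                if (name.equals("c")) {       // new cell: remember its address and type
                    ref = a.getValue("r");
                    type = a.getValue("t");
                    text.setLength(0);
                }
                // <v> holds the raw value, <t> holds inline string text; <f> (formula) is skipped
                inValue = name.equals("v") || name.equals("t");
            }
            @Override public void characters(char[] ch, int start, int len) {
                if (inValue) text.append(ch, start, len);
            }
            @Override public void endElement(String uri, String name, String qName) {
                inValue = false;
                if (!name.equals("c")) return;
                String value = text.toString();
                // t="s" means the value is an index into the shared string table
                if ("s".equals(type)) value = shared.get(Integer.parseInt(value));
                onCell.accept(ref, value);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to an .xlsx file; fails if the assumed sheet part is missing.
        try (ZipFile zip = new ZipFile(args[0])) {
            List<String> shared = new ArrayList<>();
            ZipEntry sst = zip.getEntry("xl/sharedStrings.xml");
            if (sst != null) {
                try (InputStream in = zip.getInputStream(sst)) { shared = readSharedStrings(in); }
            }
            try (InputStream in = zip.getInputStream(zip.getEntry("xl/worksheets/sheet1.xml"))) {
                readSheet(in, shared, (ref, value) -> System.out.println(ref + " = " + value));
            }
        }
    }
}
```

The shared string table is still loaded in full, which is usually far smaller than the sheet data. Reporting cells through a callback rather than returning a list is what keeps the worksheet itself from accumulating in memory.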