WeihanLi.Npoi 1.21.0 Released

Intro

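WeihanLi.Npoi is an NPOI extension library targeting netstandard2.0, used mainly for importing and exporting Excel and CSV files. It offers a Fluent API for flexible import and export configuration. For detailed usage, see the documentation and the sample projects: https://github.com/WeihanLi/WeihanLi.Npoi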

New Features

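The new feature in this release improves DataTable support. Previously, importing an Excel file with duplicate column names threw a System.Data.DuplicateNameException, because the Excel column name was used directly as the DataColumn name, and a DataTable does not allow two columns with the same name, just as a database table cannot contain duplicate column names. See the issue: https://github.com/WeihanLi/WeihanLi.Npoi/issues/125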
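The underlying DataTable behavior can be reproduced without any Excel file. The following is a minimal sketch using only System.Data; the class and method names are made up for illustration and are not part of WeihanLi.Npoi.

```csharp
using System;
using System.Data;

// Illustrative helper, not part of WeihanLi.Npoi.
public static class DuplicateNameDemo
{
    // Returns true when adding the same column name twice is rejected.
    public static bool SecondAddThrows()
    {
        var table = new DataTable();
        table.Columns.Add("A");
        try
        {
            // Same name as the first column, so the collection refuses it.
            table.Columns.Add("A");
            return false;
        }
        catch (DuplicateNameException)
        {
            return true;
        }
    }
}
```

This is the exception that surfaced when the Excel header row was copied into the DataTable as-is.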

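When importing Excel, values are often read by column index rather than by column name. In that scenario a name clash matters little, since reading by index is enough, so importing files with duplicate column names is now supported. Because the exported Excel should match what was originally imported, export now handles such DataTables as well. The unit tests below show the resulting behavior: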

// Csv
[Fact]
public void DuplicateColumnTest()
{
    var csvText = $@"A,B,C,A,B,C{Environment.NewLine}1,2,3,4,5,6";
    var dataTable = CsvHelper.ToDataTable(csvText.GetBytes());
    Assert.Equal(6, dataTable.Columns.Count);
    Assert.Equal(1, dataTable.Rows.Count);

    var newCsvText = CsvHelper.GetCsvText(dataTable);
    Assert.StartsWith("A,B,C,A,B,C", newCsvText);

    var newDataTable = CsvHelper.ToDataTable(newCsvText.GetBytes());
    Assert.Equal(dataTable.Columns.Count, newDataTable.Columns.Count);
    Assert.Equal(dataTable.Rows.Count, newDataTable.Rows.Count);
}

// Excel
[Theory]
[ExcelFormatData]
public void DuplicateColumnTest(ExcelFormat excelFormat)
{
    var workbook = ExcelHelper.PrepareWorkbook(excelFormat);
    var sheet = workbook.CreateSheet();

    var headerRow = sheet.CreateRow(0);
    headerRow.CreateCell(0).SetCellValue("A");
    headerRow.CreateCell(1).SetCellValue("B");
    headerRow.CreateCell(2).SetCellValue("C");
    headerRow.CreateCell(3).SetCellValue("A");
    headerRow.CreateCell(4).SetCellValue("B");
    headerRow.CreateCell(5).SetCellValue("C");

    var dataRow = sheet.CreateRow(1);
    dataRow.CreateCell(0).SetCellValue("1");
    dataRow.CreateCell(1).SetCellValue("2");
    dataRow.CreateCell(2).SetCellValue("3");
    dataRow.CreateCell(3).SetCellValue("4");
    dataRow.CreateCell(4).SetCellValue("5");
    dataRow.CreateCell(5).SetCellValue("6");

    var dataTable = sheet.ToDataTable();
    Assert.Equal(headerRow.Cells.Count, dataTable.Columns.Count);
    Assert.Equal(1, dataTable.Rows.Count);

    var newWorkbook = ExcelHelper.LoadExcel(dataTable.ToExcelBytes());
    var newSheet = newWorkbook.GetSheetAt(0);
    Assert.Equal(sheet.PhysicalNumberOfRows, newSheet.PhysicalNumberOfRows);

    for (var i = 0; i < sheet.PhysicalNumberOfRows; i++)
    {
        Assert.Equal(sheet.GetRow(i).Cells.Count, newSheet.GetRow(i).Cells.Count);
        for (var j = 0; j < headerRow.Cells.Count; j++)
        {
            Assert.Equal(
                sheet.GetRow(i).GetCell(j).GetCellValue<string>(),
                newSheet.GetRow(i).GetCell(j).GetCellValue<string>());
        }
    }
}

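The implementation partly follows the suggestion in the issue. On import, a duplicate column has a duplicate marker and a unique id added to its name, so that names no longer clash and no exception is thrown. On export, the marker and unique id are stripped from such columns to restore the real column name. For more details, see the PR on GitHub: https://github.com/WeihanLi/WeihanLi.Npoi/pull/126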
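The idea can be sketched roughly as follows. This is not the library's actual code: the marker text, the class name, and the method names are all hypothetical, and the real naming scheme in the PR may differ. It only illustrates the round trip of tagging a duplicate name on import and stripping the tag on export.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical sketch, not the WeihanLi.Npoi implementation.
public static class DuplicateColumnSketch
{
    // Assumed marker text; the real marker used by the library may differ.
    private const string Marker = "__duplicate__";

    // Import side: keep the header as-is unless the table already has it,
    // in which case add the marker and a unique id to avoid a name clash.
    public static string ToUniqueName(DataTable table, string header)
    {
        return table.Columns.Contains(header)
            ? $"{header}{Marker}{Guid.NewGuid():N}"
            : header;
    }

    // Export side: drop the marker and everything after it.
    // Limitation of this sketch: a genuine header that contains the marker
    // text would be truncated as well.
    public static string ToOriginalName(string columnName)
    {
        var index = columnName.IndexOf(Marker, StringComparison.Ordinal);
        return index < 0 ? columnName : columnName.Substring(0, index);
    }

    // Builds a table from a header row that may contain duplicates.
    public static DataTable BuildTable(IEnumerable<string> headers)
    {
        var table = new DataTable();
        foreach (var header in headers)
        {
            table.Columns.Add(ToUniqueName(table, header));
        }
        return table;
    }
}
```

With a header row of A, B, C, A, B, C, BuildTable produces six distinct column names, values remain readable by index (for example `row[3]` for the second A), and ToOriginalName maps every column back to its original header when writing it out again.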

Bug Fixes

Fixed a bug where the sheet name setting might not take effect

This release fixes a bug where the sheet name setting did not take effect when exporting to a file. See the issue for details: https://github.com/WeihanLi/WeihanLi.Npoi/issues/127

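I could not reproduce the bug at first, because it only occurs when exporting to a file; exporting to bytes or a stream is not affected. The following test cases have now been added to cover this case: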

[Theory]
[ExcelFormatData]
public void SheetNameTest_ToExcelFile(ExcelFormat excelFormat)
{
    IReadOnlyList<Notice> list = Enumerable.Range(0, 10)
        .Select(i => new Notice()
        {
            Id = i + 1,
            Content = $"content_{i}",
            Title = $"title_{i}",
            PublishedAt = DateTime.UtcNow.AddDays(-i),
            Publisher = $"publisher_{i}"
        })
        .ToArray();

    var settings = FluentSettings.For<Notice>();
    lock (settings)
    {
        settings.HasSheetSetting(s =>
        {
            s.SheetName = "Test";
        });

        var filePath = $"{Path.GetTempFileName()}.{excelFormat.ToString().ToLower()}";
        list.ToExcelFile(filePath);

        var excel = ExcelHelper.LoadExcel(filePath);
        Assert.Equal("Test", excel.GetSheetAt(0).SheetName);

        settings.HasSheetSetting(s =>
        {
            s.SheetName = "NoticeList";
        });
    }
}

[Theory]
[ExcelFormatData]
public void SheetNameTest_ToExcelBytes(ExcelFormat excelFormat)
{
    IReadOnlyList<Notice> list = Enumerable.Range(0, 10)
        .Select(i => new Notice()
        {
            Id = i + 1,
            Content = $"content_{i}",
            Title = $"title_{i}",
            PublishedAt = DateTime.UtcNow.AddDays(-i),
            Publisher = $"publisher_{i}"
        })
        .ToArray();

    var settings = FluentSettings.For<Notice>();
    lock (settings)
    {
        settings.HasSheetSetting(s =>
        {
            s.SheetName = "Test";
        });

        var excelBytes = list.ToExcelBytes(excelFormat);
        var excel = ExcelHelper.LoadExcel(excelBytes, excelFormat);
        Assert.Equal("Test", excel.GetSheetAt(0).SheetName);

        settings.HasSheetSetting(s =>
        {
            s.SheetName = "NoticeList";
        });
    }
}

Fixed a bug where exporting to a file could produce the wrong Excel file format

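Creating an Excel workbook from a file path had a bug that could result in the wrong file format: the old code did not extract the file extension first. The new version fixes this by getting the file extension before determining the file format.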

- !excelPath.EqualsIgnoreCase(".xls")
+ !Path.GetExtension(excelPath).EqualsIgnoreCase(".xls")
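To see why the old check misbehaves, here is a small self-contained sketch. It uses `string.Equals` with `StringComparison.OrdinalIgnoreCase` as a stand-in for the `EqualsIgnoreCase` extension, and the method names, as well as the reading of the boolean as "treat as xlsx", are assumptions made for illustration.

```csharp
using System;
using System.IO;

// Illustrative sketch, not the library's code.
public static class ExcelPathSketch
{
    // Old check: compares the whole path with ".xls", which practically
    // never matches, so the negated result is true for almost any path.
    public static bool TreatAsXlsxBuggy(string excelPath) =>
        !string.Equals(excelPath, ".xls", StringComparison.OrdinalIgnoreCase);

    // Fixed check: compares only the extension of the path with ".xls".
    public static bool TreatAsXlsxFixed(string excelPath) =>
        !string.Equals(Path.GetExtension(excelPath), ".xls", StringComparison.OrdinalIgnoreCase);
}
```

For a path such as `report.xls`, the old check still answers true because the full path is not equal to ".xls", while the fixed check answers false. For `report.xlsx` both answer true.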

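This release also includes a small improvement to CsvHelper: the includeHeader parameter is now optional when getting the exported CSV text, which makes calls a little simpler. It defaults to true, so the header is included by default.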

public static string GetCsvText(this DataTable? dataTable, bool includeHeader = true);
public static string GetCsvText<TEntity>(this IEnumerable<TEntity> entities, bool includeHeader = true);

For more details, see the changes in the PR: https://github.com/WeihanLi/WeihanLi.Npoi/pull/130