Releases: nao1215/filesql
v0.12.0
Added
- Fedwire (Legacy Wire) File Support (e7e2189): Complete legacy Fedwire message file support (Experimental)
  - File format: Tag-value text format (`.fed`) used by the Federal Reserve's large-value real-time gross settlement system
  - Flat table structure: All FEDWireMessage fields (~326 columns) flattened into a single `{baseName}_message` table with 1 row per file
  - All fields as TEXT: The wire format stores amounts as fixed-width strings; all columns use the `TEXT` type to preserve formatting
  - Full round-trip support: Parse → query/modify via SQL → export back to valid `.fed` format
  - Registry-backed TableSet: `registerWireTableSet`/`getWireTableSet`/`UnregisterWireTableSet`/`ClearWireTableSetRegistry` for managing the original Wire structures needed for round-trip export
  - `WireTableInfo` struct: Provides `MessageTable()` and `AllTableNames()` methods for programmatic table name discovery
  - `GetWireTableInfos()`: Returns `[]WireTableInfo` for all registered Fedwire files
  - `IsWireBaseTableName()`: Checks whether a table name matches the Fedwire `_message` suffix convention
  - `DumpFedWire()`/`DumpFedWireWithTableSet()`: Export Fedwire tables from the database back to `.fed` files
  - `OutputFormatFedWire`: New output format enum for auto-save and dump operations
  - `ErrWire`: Sentinel error for Fedwire operation failures
  - `FileTypeFedWire`: New file type constant with `String()`, `extension()`, and `baseType()` support
  - Streaming support: `streamFedWireFileToDatabase()` for file-path input, `streamWireFileToDatabase()` for `io.Reader` input
  - AddFS support: Fedwire files in an `fs.FS` (including `embed.FS`) are properly detected and loaded
  - Auto-save integration: `performFedWireAutoSave()` and `overwriteOriginalFiles()` handle `.fed` files alongside ACH and tabular formats
  - Test coverage: Unit tests for file detection, parsing, registry, SQL queries, round-trip, and export
Fixed
- Windows file lock in AddFS (e7e2189): Added `closer: file` to `readerInput` in `file_processor.go` so that FS-opened files are properly closed after streaming. Previously, `TempDir` `RemoveAll` cleanup failed on Windows because files remained open
- Dead code removal in builder.go (e7e2189): Removed the unused `processFSToReaders` method (~85 lines) and the `deduplicateCompressedFiles` wrapper from `DBBuilder`. The actual code path uses `fileProcessor.processFSToReaders()`. Updated `builder_test.go` accordingly
Changed
- Documentation Updates: Added Fedwire Support sections to all README files (7 languages: EN, ES, FR, JA, KO, RU, ZH-CN)
- Supported formats table updated with the `.fed` extension
- Experimental status warning, table structure explanation, TEXT column rationale, limitations, security considerations, and code examples
- Auto-save cleanup refactoring: Renamed `cleanupACHRegistry()` to `cleanupTableSetRegistries()` to handle both ACH and Fedwire registry cleanup on connection close
- ACH table detection improvement: `dumpSQLiteDatabase()` now verifies registry presence before treating `_message`-suffixed tables as ACH, preventing false positives with Fedwire's `_message` tables
Dependencies
- `github.com/nao1215/fileparser`: v0.4.0 → v0.5.1 (adds Wire subpackage for Fedwire parsing)
- `github.com/moov-io/wire`: v0.15.7 (new indirect dependency via fileparser)
v0.11.0
Added
- JSON / JSONL File Support (e45329d): Complete JSON and JSON Lines file format support with SQLite `json_extract()` integration
  - JSON format: Array root → one row per element, Object root → single row. Raw JSON stored in a `data TEXT` column
  - JSONL format: One row per line via `bufio.Reader` (no line size limit). Empty lines silently skipped, invalid lines rejected with line number
  - Query with `json_extract()`: Access nested fields via SQLite's built-in JSON functions
    - Example: `SELECT json_extract(data, '$.name') FROM my_table`
    - Example: `SELECT json_extract(data, '$.address.city') FROM my_table`
  - Compression support: All 8 compression formats supported for both JSON and JSONL (`.json.gz`, `.json.bz2`, `.json.xz`, `.json.zst`, `.json.z`, `.json.snappy`, `.json.s2`, `.json.lz4`, and the corresponding `.jsonl.*` variants)
  - Streaming chunk processing: `processJSONInChunks` uses `json.Decoder` to stream array elements one at a time without loading the entire array into memory, preventing OOM for large JSON files. `processJSONLInChunks` provides true line-by-line streaming
  - Trailing data validation: Rejects malformed JSON with trailing garbage after an array (e.g., `[{"a":1}] garbage`)
  - 18 new `FileType` constants: `FileTypeJSON`, `FileTypeJSONL`, plus 16 compressed variants
  - Test coverage: Unit tests, integration tests with `json_extract()` queries, compressed format tests (gzip, zstd, snappy, s2, lz4, zlib), 85.0% overall coverage
Fixed
- Missing `parserFileType` mappings (e45329d): Added missing zlib, snappy, s2, and lz4 mappings for CSV, TSV, LTSV, Parquet, and XLSX in `parser_bridge.go`. Previously these compressed variants would fall through to `Unsupported`
- Pre-existing lint issues: Fixed `prealloc` warnings in `builder.go` and `file.go`, removed the unused `colType` parameter from `newColumnInfoWithType`
Changed
- Documentation Updates: Added JSON/JSONL sections to all README files (7 languages: EN, ES, FR, JA, KO, RU, ZH-CN)
- Supported formats table updated with the `.json` and `.jsonl` base formats and all compression variants
- Usage examples with `json_extract()` queries for flat and nested JSON structures
- Test coverage for `parserFileType`: Added 38 test cases covering ZLIB, SNAPPY, S2, LZ4 for all existing formats plus all 18 JSON/JSONL mappings
Dependencies
- `github.com/nao1215/fileparser`: v0.3.1 → v0.4.0 (adds JSON/JSONL parsing support)
- `modernc.org/sqlite`: 1.40.1 → 1.45.0
- `github.com/klauspost/compress`: 1.18.2 → 1.18.4
- `github.com/pierrec/lz4/v4`: 4.1.22 → 4.1.25
v0.10.0
Added
- Custom Logger Support: Flexible logging system with slog integration
- `Logger` interface: Simple logging interface with `Debug`, `Info`, `Warn`, `Error`, and `With` methods
- `ContextLogger` interface: Extended logging interface with context-aware methods (`DebugContext`, `InfoContext`, `WarnContext`, `ErrorContext`)
- `NewSlogAdapter()`: Adapter to use the standard library's `slog.Logger` with filesql's `Logger` interface
- `NewSlogContextAdapter()`: Adapter for context-aware logging with `slog.Logger`
- `WithLogger()`: Builder method to inject a custom logger into the build and open process
- `nopLogger`: Zero-overhead no-op logger implementation used as the default (benchmarked at ~0.2 ns/op)
- Logging throughout build, validation, and database opening operations
- Comprehensive test coverage and benchmarks for all logger implementations
Changed
- Documentation Updates: Added Custom Logger section to all README files (7 languages: EN, ES, FR, JA, KO, RU, ZH-CN)
- Usage examples with slog integration
- Logger and ContextLogger interface definitions
- Performance benchmark comparison table
v0.9.0
Added
- Read-Only Database Mode: New `ReadOnlyDB` wrapper for safe read-only access to databases
  - `NewReadOnlyDB(db)`: Wraps an existing `*sql.DB` to prevent write operations
  - `ReadOnlyDB.Query()`, `QueryContext()`, `QueryRow()`, `QueryRowContext()`: Read operations work normally
  - `ReadOnlyDB.Exec()`, `ExecContext()`: Return `ErrReadOnly` for write operations (INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, TRUNCATE, REPLACE, UPSERT)
  - `ReadOnlyDB.Prepare()`, `PrepareContext()`: Reject preparation of write statements
  - `ReadOnlyDB.Begin()`, `BeginTx()`: Return a `ReadOnlyTx` for read-only transactions
  - `ReadOnlyDB.Ping()`, `PingContext()`, `Close()`, `DB()`: Standard database operations
  - `ReadOnlyStmt`: Read-only prepared statement wrapper
  - `ReadOnlyTx`: Read-only transaction wrapper with the same protections
  - `DBBuilder.OpenReadOnly(ctx)`: Convenience method to open a database in read-only mode
  - `ErrReadOnly`: Sentinel error for rejected write operations
  - Useful for audit scenarios where data must be viewable without risk of modification
- ACHTableInfo Struct: New struct for managing ACH table name information
  - `ACHTableInfo.BaseName`: The base table name derived from the ACH filename
  - `ACHTableInfo.FileHeaderTable()`: Returns `{baseName}_file_header`
  - `ACHTableInfo.BatchesTable()`: Returns `{baseName}_batches`
  - `ACHTableInfo.EntriesTable()`: Returns `{baseName}_entries`
  - `ACHTableInfo.AddendaTable()`: Returns `{baseName}_addenda`
  - `ACHTableInfo.IATBatchesTable()`: Returns `{baseName}_iat_batches`
  - `ACHTableInfo.IATEntriesTable()`: Returns `{baseName}_iat_entries`
  - `ACHTableInfo.IATAddendaTable()`: Returns `{baseName}_iat_addenda`
  - `ACHTableInfo.AllTableNames()`: Returns all possible table names for the base name
  - `GetACHTableInfos()`: Returns `[]ACHTableInfo` for all registered ACH files
Changed
- Internal ACH Function: Made `GetACHBaseTableNames` private (`getACHBaseTableNames`) as it was only used internally
  - Use `GetACHTableInfos()` for public access to ACH table information
v0.8.0
Added
- New Compression Formats: Added support for 4 new compression formats via fileparser v0.2.0
- zlib (.z) - Standard DEFLATE compression
- snappy (.snappy) - Google's high-speed compression
- s2 (.s2) - Improved Snappy extension, faster
- lz4 (.lz4) - Extremely fast compression
v0.7.0
Changed
- Migrated from the internal `github.com/nao1215/filesql/parser` package to the external `github.com/nao1215/fileparser` for file parsing
- Updated all internal references from `parser.` to `fileparser.`
Removed
- Internal `parser` package (now using `github.com/nao1215/fileparser v0.1.0` as an external dependency)
v0.6.0
Added
- Public Parser Package (6271e5ef): Exposed the internal parser as a public API for use in external projects
- New `parser` package: Standalone file parsing without a SQLite dependency
  - `parser.Parse()`: Parse CSV, TSV, LTSV, XLSX, and Parquet files from an `io.Reader`
  - `parser.DetectFileType()`: Automatic file type detection from a file path
  - `parser.BaseFileType()`: Get the base file type from a potentially compressed file type
- Type exports: `TableData`, `ColumnType`, and `FileType` types for working with parsed data
- Parquet support: Full Parquet parsing in `parser/parquet.go`
- XLSX support: Excel file parsing in `parser/xlsx.go`
- Comprehensive test coverage: 90%+ coverage for the parser package
- ORM Integration Examples (281ede2): Added example code for popular Go ORMs and query builders
- GORM: Full GORM integration example with model definitions
- Bun: Bun ORM example with struct scanning
- Ent: Facebook's Ent framework example with generated code
- sqlx: sqlx example with struct tags
- sqlc: sqlc example with generated type-safe queries
- Squirrel: Squirrel query builder example
- Basic: Standard library database/sql example
- Multi-format: Example combining CSV, TSV, and LTSV files
- FileType.String() Method: Added a `fmt.Stringer` implementation for the `FileType` enum
  - Human-readable format names for logging and debugging
  - Returns names like "CSV", "TSV", "LTSV", "XLSX", "Parquet", etc.
Changed
- Documentation Updates: Enhanced README files across all 7 languages
Technical Details
- Architecture: Parser package enables lightweight file parsing without database overhead
- Compatibility: Parser package can be used independently of the main filesql package
- Testing: Added comprehensive test suites for parser, types, and error handling
v0.5.0
Added
- Benchmark Tests (2852ea2): Added benchmark infrastructure for performance testing
- New `make benchmark` target in the Makefile for running benchmark tests
- Benchmark tests isolated behind the `//go:build benchmark` tag to prevent execution during regular tests
- `BenchmarkOpenContext` and `BenchmarkOpenContextParallel` for measuring CSV loading performance
Improved
- Major Performance Optimization (d20b3c8, e95a5bf): Significantly improved file loading performance
- 55% faster execution: Reduced 100,000-row CSV loading time from ~960ms to ~430ms
- 12% less memory: Reduced memory usage from ~161MB to ~141MB
- Transaction batching: Wrapped all INSERT operations in a single transaction to reduce SQLite disk sync operations
- Slice reuse: Pre-allocate and reuse value slices in `insertChunkData()` to reduce allocations
- Pre-allocation in type inference: Optimized `newColumnInfoList()` and `inferColumnsInfo()` with pre-allocated column value slices
Fixed
- Data Integrity in Chunk Insertion (b191d93): Fixed potential data corruption issues in `insertChunkData()`
  - Stale value prevention: Fixed an issue where records with fewer columns than headers could retain stale values from previous rows
  - Extra column detection: Added validation to fail fast when records have more columns than headers, preventing silent data truncation
Changed
- Documentation Updates (17a42fa): Added benchmark results to all README files (7 languages)
- Performance metrics: ~430ms execution time, ~141MB memory for 100,000-row CSV
Dependencies
- `github.com/klauspost/compress`: 1.18.1 → 1.18.2
v0.4.6
Added
- Header-Only File Support (PR #67, 5de8801): Files with headers but no data records are now supported
- CSV, TSV, Parquet, and XLSX formats can now be loaded with only header rows
- Creates empty SQLite tables with correct column names (all columns as TEXT type)
- Useful for schema definition files and template files
- Example: A CSV file containing only `id,name,age` will create a table with those columns but zero rows
Fixed
- LTSV Error Handling: Improved error messages for invalid LTSV data
- Now correctly returns a `"no valid LTSV keys found"` error instead of silently creating empty tables
- LTSV format requires `key:value` pairs, so the header-only concept does not apply
Changed
- Dependencies: Updated library dependencies
- `modernc.org/sqlite`: 1.40.0 → 1.40.1
- `github.com/klauspost/compress`: 1.18.0 → 1.18.1
- `github.com/xuri/excelize/v2`: 2.9.1 → 2.10.0
- `golang.org/x/crypto`: Security update
- `actions/checkout`: 4 → 6
v0.4.5
Fixed
- Table Name Sanitization: Fixed SQL syntax errors caused by special characters in file names
- Applied `sanitizeTableName()` to all table name generation paths
- Hyphens, spaces, and special characters are now automatically converted to underscores
- Example: `"user-data.csv"` → table `"user_data"`, `"my file.csv"` → table `"my_file"`
- Updated test expectations to match sanitized table names
Improved
- API Documentation: Enhanced documentation for public APIs to clarify table name sanitization
- Updated `Open()`, `OpenContext()`, and `DBBuilder.Open()` method documentation
- Added examples showing special character conversion in table names
- Improved `sanitizeTableName()` function documentation with detailed transformation rules
- Development Experience: Optimized test execution time for local development
- Added GitHub Actions environment checks to skip slow tests locally
- Reduced local test execution time by 63% (from ~55s to ~20s)
- Maintained full test coverage in CI/CD while improving developer productivity
Technical Details
- Breaking Change Prevention: Preserved existing `tableFromFilePath()` behavior for backward compatibility
- Test Coverage: Maintained 80.7% test coverage with updated test expectations
- Performance: No impact on runtime performance, only development-time improvements