pelletier197 / kotlin-stream-csv

A Kotlin and Java CSV parser that uses the power of the Kotlin DSL to simplify parsing and error handling compared to existing solutions

License: MIT License

Kotlin 97.47% Java 2.53%

kotlin-stream-csv's Introduction

Kotlin Stream CSV

A pure Kotlin implementation of a CSV parser. It uses the simplicity and power of Kotlin while remaining compatible with Java source code. It is completely stream driven to maximize performance and flexibility.

Yet another CSV library?

This project started after facing an issue with regular CSV parsers: they throw an error midway through as soon as they hit an invalid input in the CSV. This is frustrating when you want to compute all the errors in the CSV and return them to the client, or even just ignore invalid inputs.

This library takes a lazy error-handling approach instead. If an input is not parsable, it returns a result containing the errors for that input, and you decide what to do with them.
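
To make the idea concrete, here is a minimal, self-contained sketch of lazy error handling. It does not use this library's API: `LineResult` and `parseAges` are hypothetical names invented for illustration. Each line yields either a value or a list of error messages, and the caller chooses what to do with them.

```kotlin
// Hypothetical stand-in type for illustration; not part of kotlin-stream-csv.
data class LineResult(val line: Int, val value: Int?, val errors: List<String>)

// Parses one integer per line and records failures instead of throwing.
fun parseAges(lines: List<String>): List<LineResult> =
    lines.mapIndexed { index, raw ->
        val parsed = raw.trim().toIntOrNull()
        if (parsed != null) {
            LineResult(line = index + 1, value = parsed, errors = emptyList())
        } else {
            LineResult(line = index + 1, value = null, errors = listOf("'$raw' is not a number"))
        }
    }

fun main() {
    val results = parseAges(listOf("42", "abc", "7"))
    // The caller chooses the policy: keep the valid rows, report the rest.
    val (valid, invalid) = results.partition { it.errors.isEmpty() }
    println(valid.map { it.value })  // [42, 7]
    println(invalid.map { it.line }) // [2]
}
```

Because nothing is thrown, a single pass over the input is enough to gather every problem in the file.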

Characteristics

  1. Collect errors as it goes - you can customize how you handle each specific error, instead of throwing an exception on the first one
  2. Easy to configure
  3. Everything is immutable
  4. Extremely lightweight (No dependencies).
  5. Kotlin ❤️

Usage

Three types of parsers are available:

  • Typed CSV parser will read your CSV file directly into a data class
  • Header CSV parser will return a Map<String, String> for each row, where the map's key is the header's value
  • Raw CSV reader will return a List<String> for every line of the CSV

For advanced configuration examples of all three types of parsers, see the example project.

Typed CSV parser

Probably the most useful of the three parsers for most use cases. All the examples below are based on the following CSV:

first_name,last_name,phone_number  ,emails
John      ,Doe      ,1+342-534-2342,"[email protected], [email protected]"
Alice     ,Doe      ,1+423-253-3453,[email protected] 

Basic usage

data class CsvPerson(
    // @CsvProperty specifies the header name in the CSV, so you can name your class field however you wish
    @CsvProperty("first_name")
    val firstName: String,
    @CsvProperty("last_name")
    val lastName: String,
    @CsvProperty("phone_number")
    val phoneNumber: String,
    val emails: Set<String>
)

val reader = CsvReaders
    .forType<CsvPerson>()
val people = reader.read(csv).map { it.getResultOrThrow() }.toList()

println(people.joinToString(separator = "\n"))
// Output:
// CsvPerson(firstName=John, lastName= Doe, phoneNumber= 1+342-534-2342, emails=[[email protected], [email protected]])
// CsvPerson(firstName=Alice, lastName= Doe, phoneNumber= 1+423-253-3453, emails=[ [email protected] ])

Error handling

It is fairly simple to handle the errors of a CSV input:

// Missing emails field
val invalidCsv =
    """
        first_name, last_name, phone_number, emails 
        John, Doe, 1+342-534-2342
    """.trimIndent()

val reader = CsvReaders
    .forType<CsvPerson>()
    .withEmptyStringsAsNull(true)

reader.read(invalidCsv).forEach { println(it) }
// Output:
// TypedCsvLine(result=null, line=2, errors=[CsvError(csvField=emails, classField=emails, providedValue=null, type=NON_NULLABLE_FIELD_IS_NULL, cause=null)])

Output fields description

| Field | Description |
| --- | --- |
| result | Always non-null if there are no errors. It means the line was parsed successfully |
| errors[].csvField | The field in the CSV that is missing |
| errors[].classField | The field in the recipient class that is missing. Will differ from csvField if @CsvProperty is used. |
| errors[].providedValue | The value provided in the CSV that caused this error. Will be null if the error is NON_NULLABLE_FIELD_IS_NULL |
| errors[].type | The error type. See error type descriptions |
| errors[].cause | The root exception that caused the error, if there is one |

Error type descriptions

| Error type | Description |
| --- | --- |
| NON_NULLABLE_FIELD_IS_NULL | In a Kotlin data class, this occurs when target field is of non-nullable type but provided value is null |
| NO_CONVERTER_FOUND_FOR_VALUE | When trying to convert a value to a field class that has no converter. You should register a custom converter to support this field |
| CONVERSION_OF_FIELD_FAILED | When trying to convert a field and the converter throws an exception. |
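
One way to act on these error types is a `when` over the type. The sketch below is self-contained and only illustrative: `ErrorType` and `FieldError` are local stand-ins that mirror the names listed above, not the library's own classes, and the message wording is an assumption.

```kotlin
// Local stand-ins mirroring the names listed above; the real classes live in
// the library and may differ in shape.
enum class ErrorType { NON_NULLABLE_FIELD_IS_NULL, NO_CONVERTER_FOUND_FOR_VALUE, CONVERSION_OF_FIELD_FAILED }

data class FieldError(val csvField: String, val providedValue: String?, val type: ErrorType)

// Turns one error into a message suitable for returning to a client.
fun describe(line: Int, error: FieldError): String = when (error.type) {
    ErrorType.NON_NULLABLE_FIELD_IS_NULL ->
        "line $line: '${error.csvField}' is required but missing"
    ErrorType.NO_CONVERTER_FOUND_FOR_VALUE ->
        "line $line: no converter registered for '${error.csvField}'"
    ErrorType.CONVERSION_OF_FIELD_FAILED ->
        "line $line: could not convert '${error.providedValue}' for '${error.csvField}'"
}

fun main() {
    val error = FieldError(csvField = "emails", providedValue = null, type = ErrorType.NON_NULLABLE_FIELD_IS_NULL)
    println(describe(2, error)) // line 2: 'emails' is required but missing
}
```

Since the `when` is exhaustive over the enum, the compiler flags any error type that is left unhandled.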

Custom converters

When you want to map a CSV field to a custom object of your own, register a custom converter for it.

class EmailConverter : Converter<String, Email> {
    override val source: Class<String> get() = String::class.java
    override val target: Class<Email> get() = Email::class.java

    override fun convert(value: String, to: Type, context: ConversionContext): Email {
        return Email(value = value)
    }
}

val reader = CsvReaders
    .forType<CustomCsvPerson>()
    .withConverter(EmailConverter())

val people = reader.read(csv).map { it.getResultOrThrow() }.toList()
// Output:
// CustomCsvPerson(firstName=John, lastName= Doe, emails=[Email([email protected]), Email(value= [email protected])])
// CustomCsvPerson(firstName=Alice, lastName= Doe, emails=[Email(value= [email protected] )])

Header CSV Parser

This parser is useful when you don't know the exact input format of the CSV, or when you don't want to map rows to a class. It uses the first non-empty line of the CSV as the header if you don't provide one programmatically.

Basic usage

val reader = CsvReaders
    .header()
    .withHeader("first_name", "last_name", "phone_number", "emails") // If you wish to provide the header yourself
val people = reader.read(csv).map { it }.toList()

println(people.joinToString(separator = "\n"))
// Output:
// HeaderCsvLine(values={first_name=John, last_name= Doe, phone_number= 1+342-534-2342, [email protected], [email protected]}, line=2)
// HeaderCsvLine(values={first_name=Alice, last_name= Doe, phone_number= 1+423-253-3453, emails= [email protected] }, line=3)
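
Conceptually, each row becomes a map by pairing header names with column values. The sketch below illustrates that pairing with the standard library only. `toHeaderMap` is a hypothetical helper, not the library's implementation, and how the library treats rows shorter or longer than the header is not covered here.

```kotlin
// Hypothetical helper for illustration; not the library's implementation.
// zip stops at the shorter list, so extra columns or header names are dropped.
fun toHeaderMap(header: List<String>, columns: List<String>): Map<String, String> =
    header.zip(columns).toMap()

fun main() {
    val header = listOf("first_name", "last_name")
    val row = listOf("John", "Doe")
    println(toHeaderMap(header, row)) // {first_name=John, last_name=Doe}
}
```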

Raw CSV parser

This last one is the low-level parser. It returns every raw line of the CSV as it comes.

Basic usage

val reader = CsvReaders.raw()
val people = reader.read(csv).map { it }.toList()

println(people.joinToString(separator = "\n"))
// Output:
// RawCsvLine(columns=[first_name,  last_name,  phone_number,  emails], line=1)
// RawCsvLine(columns=[John,  Doe,  1+342-534-2342, [email protected], [email protected]], line=2)
// RawCsvLine(columns=[Alice,  Doe,  1+423-253-3453,  [email protected] ], line=3)

Configuration

Configuration is simple and versatile. Every configuration change creates a new immutable parser to avoid side effects. The options below are available for the different parsers, each through an explicitly named method on the parser.

| Configuration | Definition | Default | Typed | Header | Raw |
| --- | --- | --- | --- | --- | --- |
| Separator | The separator to use between columns. For now, only a single character can be used as a separator. | , | ✔️ | ✔️ | ✔️ |
| Delimiter | The quote character that delimits a column, for when you want to use the separator inside a column. | " | ✔️ | ✔️ | ✔️ |
| Trim entries | Whether to trim entries when parsing the input. | false | ✔️ | ✔️ | ✔️ |
| Skip empty lines | Whether to skip empty lines. | true | ✔️ | ✔️ | ✔️ |
| Empty strings as null | Whether to treat empty strings as null when parsing the columns. | false | ✔️ | ✔️ | ✔️ |
| Encoding | The input's encoding. | Platform default | ✔️ | ✔️ | ✔️ |
| Header | Sets the header of the parser. When not configured, the first non-empty line is used as the header. | null | ✔️ | ✔️ | |
| List separator | The character to use when converting a string to a collection (list, set). | , | ✔️ | | |
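
The "new immutable parser on every change" behaviour can be pictured with a Kotlin data class and `copy`. This is a sketch of the pattern only: `ReaderConfig` and its methods are hypothetical names, not the library's internals.

```kotlin
// Hypothetical configuration holder; the names are illustrative only.
data class ReaderConfig(
    val separator: Char = ',',
    val trimEntries: Boolean = false,
) {
    // Each "with" method returns a modified copy and leaves the receiver untouched.
    fun withSeparator(separator: Char): ReaderConfig = copy(separator = separator)
    fun withTrimEntries(trim: Boolean): ReaderConfig = copy(trimEntries = trim)
}

fun main() {
    val base = ReaderConfig()
    val tabbed = base.withSeparator('\t').withTrimEntries(true)
    println(base.separator == ',') // true: the original is unchanged
    println(tabbed.trimEntries)    // true
}
```

With this pattern a base configuration can be shared safely and specialised per input without one caller affecting another.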

RFC-4180

This implementation respects all specifications of RFC-4180.
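
For readers unfamiliar with the RFC, its quoting rules are the part that matters most in practice: a field may be wrapped in double quotes so that it can contain the separator, and a doubled quote inside a quoted field stands for one literal quote. The sketch below illustrates those two rules for a single record. `splitRecord` is a hypothetical function written for this explanation, not the library's parser, and it deliberately ignores line breaks inside quoted fields.

```kotlin
// Splits one CSV record into fields using RFC 4180 quoting rules.
// Illustrative only: line breaks inside quoted fields are not handled.
fun splitRecord(record: String, separator: Char = ',', quote: Char = '"'): List<String> {
    val fields = mutableListOf<String>()
    val current = StringBuilder()
    var inQuotes = false
    var i = 0
    while (i < record.length) {
        val c = record[i]
        when {
            // A doubled quote inside a quoted field is one literal quote.
            inQuotes && c == quote && i + 1 < record.length && record[i + 1] == quote -> {
                current.append(quote)
                i++ // skip the second quote of the pair
            }
            // A lone quote opens or closes a quoted field.
            c == quote -> inQuotes = !inQuotes
            // A separator outside quotes ends the current field.
            c == separator && !inQuotes -> {
                fields.add(current.toString())
                current.setLength(0)
            }
            else -> current.append(c)
        }
        i++
    }
    fields.add(current.toString())
    return fields
}

fun main() {
    val fields = splitRecord("John,Doe,\"a, b\"")
    println(fields.size) // 3
    println(fields[2])   // a, b
    println(splitRecord("\"say \"\"hi\"\"\"")[0]) // say "hi"
}
```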

kotlin-stream-csv's Issues

Use with Kotlin Kover results in InvalidTargetClass

Using this library together with Kotlin Kover results in an InvalidTargetClass exception. This appears to be due to the synthetic members that Kover adds to the target classes.

io.github.pelletier197.csv.reader.reflect.InvalidTargetClass:
Invalid recipient class: 'oftr.data.CsvMember'. Recipient class is expected to have a public constructor for all the public field of the target class and no other parameters.

>> If the constructor exists and is public, it is possible that constructor's fields name are lost at compile time, which makes it impossible to find field's parameter names. With Java 8+, you can enable JVM option to conserve field name values with compile option '-parameters'. 


at io.github.pelletier197.csv.reader.reflect.InstanceCreator.getTargetConstructor(Reflection.kt:168)
at io.github.pelletier197.csv.reader.reflect.InstanceCreator.createInstance(Reflection.kt:83)
at io.github.pelletier197.csv.reader.reflect.CsvReflectionCreator.createCsvInstance(CsvReflectionCreator.kt:62)
at io.github.pelletier197.csv.reader.types.TypedCsvReader.readHeaderLines$lambda-1(TypedCsvParser.kt:100)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
at io.github.pelletier197.csv.reader.parser.CsvLineParser$createSplitIterator$1.tryAdvance(CsvLineParser.kt:60)
at java.base/java.util.Spliterator.forEachRemaining(Spliterator.java:326)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
at oftr.CsvMemberDataLoader.loadMembers(CsvMemberDataLoader.kt:61)
at oftr.CsvMemberDataLoaderTest.testCsvDataLoader(CsvDataLoaderTest.kt:35)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135)
at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)
at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99)
at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79)
at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
at com.sun.proxy.$Proxy2.stop(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.stop(TestWorker.java:135)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
at java.base/java.lang.Thread.run(Thread.java:829)

Enable properties to be ignored

Sometimes in data classes it's useful to have a calculated property that's derived based on other values. Currently kotlin-stream-csv doesn't know to ignore these when reading/parsing the source CSV file. It would be great to enhance kotlin-stream-csv so that it automagically knows to ignore these scenarios, or alternatively they can be explicitly tagged as fields to ignore.

Here's an example of a data class with the "fullName" property that is not present in the source CSV file.

data class Member(
    @CsvProperty("MemberID")
    val membershipId: String,

    @CsvProperty("FirstName")
    val firstName: String,

    @CsvProperty("LastName")
    val lastName: String,

    // we want to ignore this property when parsing csv
    val fullName: String = "$firstName $lastName"
)
