Reproduction:
@Test
fun foo() {
    data class TimeZonesTest(val with_timezone_offset: Instant, val without_timezone_offset: LocalDateTime)

    val csvContent =
        """
        with_timezone_offset,without_timezone_offset
        2024-12-12T13:00:00+01:00,2024-12-12T13:00:00
        """.trimIndent()

    val df = DataFrame.readCsv(
        csvContent.byteInputStream(),
        // colTypes = mapOf("with_timezone_offset" to ColType.Instant) // *1
        // parserOptions = ParserOptions(dateTimeFormatter = ISO_OFFSET_DATE_TIME), // *2
    )
    println(df)
    println(df.schema())

    val parsed = df.toListOf<TimeZonesTest>().first()
    assertEquals(Instant.parse("2024-12-12T13:00:00+01:00"), parsed.with_timezone_offset)
}
This outputs:
   with_timezone_offset without_timezone_offset
 0     2024-12-12T12:00        2024-12-12T13:00

with_timezone_offset: kotlinx.datetime.LocalDateTime
without_timezone_offset: kotlinx.datetime.LocalDateTime

org.opentest4j.AssertionFailedError:
Expected :2024-12-12T12:00:00Z
Actual   :2024-12-12T11:00:00Z
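
The one-hour gap is consistent with the following reading, which is an assumption and has not been checked against the library's code: the offset is normalized to UTC and then dropped (giving the printed `2024-12-12T12:00`), and the resulting zone-less value is later interpreted in the machine's default time zone, assumed here to be UTC+01:00. A minimal sketch of that arithmetic using only `java.time` (the function name `shiftedInstant` is hypothetical):

```kotlin
import java.time.Instant
import java.time.LocalDateTime
import java.time.OffsetDateTime
import java.time.ZoneOffset

// Hypothetical helper: reproduces the suspected double shift, not DataFrame's actual code path.
fun shiftedInstant(text: String, assumedDefaultZone: ZoneOffset): Instant {
    val original = OffsetDateTime.parse(text)

    // Step 1: normalize to UTC and drop the offset, leaving a zone-less LocalDateTime.
    val asLocal: LocalDateTime = original.withOffsetSameInstant(ZoneOffset.UTC).toLocalDateTime()

    // Step 2: interpret the zone-less value in the assumed default zone.
    return asLocal.toInstant(assumedDefaultZone)
}

fun main() {
    val text = "2024-12-12T13:00:00+01:00"
    // Assumption: the test machine's default zone was UTC+01:00.
    val assumedZone = ZoneOffset.ofHours(1)

    println(OffsetDateTime.parse(text).toInstant()) // expected: 2024-12-12T12:00:00Z
    println(shiftedInstant(text, assumedZone))      // actual:   2024-12-12T11:00:00Z
}
```

If this reading is right, the result of the test depends on the time zone of the machine it runs on.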
Changing the dateTimeFormatter (*2) has no effect on the test outcome.
Explicitly telling the parser to parse as Instant (*1) fixes the issue.
However, following the principle of least surprise, IMHO it would be A LOT better to:
- either have the conversion fail, because the column was parsed as LocalDateTime, so it carries no timezone information and cannot be converted into an Instant unambiguously,
- or automatically detect that the data contains a timezone offset and parse the column as Instant directly.
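
To make the two options concrete, here is a rough sketch of what they could look like, written against plain `java.time` rather than DataFrame's parser. All names (`Inferred`, `inferColumn`, `requireInstants`) are hypothetical and only illustrate the idea; they are not part of the library.

```kotlin
import java.time.Instant
import java.time.LocalDateTime
import java.time.OffsetDateTime
import java.time.format.DateTimeParseException

// Hypothetical result types for the inferred column.
sealed interface Inferred
data class AsInstant(val values: List<Instant>) : Inferred
data class AsLocalDateTime(val values: List<LocalDateTime>) : Inferred

// Returns an Instant if the cell carries an explicit offset, otherwise null.
fun parseWithOffsetOrNull(cell: String): Instant? =
    try {
        OffsetDateTime.parse(cell).toInstant()
    } catch (e: DateTimeParseException) {
        null
    }

// Option 2: if every cell has an offset, infer Instant; otherwise fall back to LocalDateTime.
fun inferColumn(cells: List<String>): Inferred {
    val instants = cells.map { parseWithOffsetOrNull(it) }
    if (cells.isNotEmpty() && instants.all { it != null }) {
        return AsInstant(instants.filterNotNull())
    }
    // Throws DateTimeParseException if a cell is neither form.
    return AsLocalDateTime(cells.map { LocalDateTime.parse(it) })
}

// Option 1: refuse to turn zone-less values into Instants instead of guessing a zone.
fun requireInstants(column: Inferred): List<Instant> = when (column) {
    is AsInstant -> column.values
    is AsLocalDateTime -> error("Column has no timezone information; cannot convert to Instant")
}

fun main() {
    println(inferColumn(listOf("2024-12-12T13:00:00+01:00")))
    println(inferColumn(listOf("2024-12-12T13:00:00")))
}
```

With this sketch, the `with_timezone_offset` column would be inferred as `AsInstant`, and calling `requireInstants` on the `without_timezone_offset` column would throw instead of silently applying a default zone.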