================================================================================================
Benchmark to measure CSV read/write performance
================================================================================================

OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parsing quoted values:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
One quoted string                                 23353          23432          75          0.0      467067.4       1.0X

OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Wide rows with 1000 columns:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 1000 columns                               56825          57244         679          0.0       56825.1       1.0X
Select 100 columns                                20482          20568          86          0.0       20481.7       2.8X
Select one column                                 16968          17000          36          0.1       16967.7       3.3X
count()                                            3366           3378          11          0.3        3366.4      16.9X
Select 100 columns, one bad input field           28347          28379          30          0.0       28346.6       2.0X
Select 100 columns, corrupt record field          32401          32450          42          0.0       32401.2       1.8X

OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Count a dataset with 10 columns:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns + count()                       11174          11195          18          0.9        1117.4       1.0X
Select 1 column + count()                          7666           7694          24          1.3         766.6       1.5X
count()                                            2042           2048           5          4.9         204.2       5.5X

OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps                      854            882          27         11.7          85.4       1.0X
to_csv(timestamp)                                  6166           6174          13          1.6         616.6       0.1X
write timestamps to files                          6480           6575         158          1.5         648.0       0.1X
Create a dataset of dates                           948            949           1         10.6          94.8       0.9X
to_csv(date)                                       4471           4474           3          2.2         447.1       0.2X
write dates to files                               4599           4616          15          2.2         459.9       0.2X

OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Read dates and timestamps:                                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------
read timestamp text from files                                                  1200           1213          12          8.3         120.0       1.0X
read timestamps from files                                                     11576          11601          22          0.9        1157.6       0.1X
infer timestamps from files                                                    23234          23253          16          0.4        2323.4       0.1X
read date text from files                                                       1115           1162          44          9.0         111.5       1.1X
read date from files                                                           10978          11006          43          0.9        1097.8       0.1X
infer date from files                                                          22588          22604          13          0.4        2258.8       0.1X
timestamp strings                                                               1224           1236          21          8.2         122.4       1.0X
parse timestamps from Dataset[String]                                          13566          13595          41          0.7        1356.6       0.1X
infer timestamps from Dataset[String]                                          25057          25094          36          0.4        2505.7       0.0X
date strings                                                                    1618           1626           7          6.2         161.8       0.7X
parse dates from Dataset[String]                                               12784          12816          34          0.8        1278.4       0.1X
from_csv(timestamp)                                                            12008          12088          69          0.8        1200.8       0.1X
from_csv(date)                                                                 11930          11938          12          0.8        1193.0       0.1X
infer error timestamps from Dataset[String] with default format                14366          14394          35          0.7        1436.6       0.1X
infer error timestamps from Dataset[String] with user-provided format          14380          14412          52          0.7        1438.0       0.1X
infer error timestamps from Dataset[String] with legacy format                 14439          14453          21          0.7        1443.9       0.1X

OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
AMD EPYC 7763 64-Core Processor
Filters pushdown:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
w/o filters                                        4302           4383         137          0.0       43020.6       1.0X
pushdown disabled                                  4206           4220          13          0.0       42058.8       1.0X
w/ filters                                          776            784          10          0.1        7756.3       5.5X


