================================================================================================
Benchmark to measure CSV read/write performance
================================================================================================

OpenJDK 64-Bit Server VM 21.0.6+7-LTS on Linux 6.8.0-1020-azure
AMD EPYC 7763 64-Core Processor
Parsing quoted values:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
One quoted string                                 24351          24419          60          0.0      487014.8       1.0X

OpenJDK 64-Bit Server VM 21.0.6+7-LTS on Linux 6.8.0-1020-azure
AMD EPYC 7763 64-Core Processor
Wide rows with 1000 columns:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 1000 columns                               56834          57144         501          0.0       56834.4       1.0X
Select 100 columns                                21054          21095          55          0.0       21054.0       2.7X
Select one column                                 17523          17550          27          0.1       17522.9       3.2X
count()                                            3658           3676          25          0.3        3657.7      15.5X
Select 100 columns, one bad input field           25678          25832         245          0.0       25678.1       2.2X
Select 100 columns, corrupt record field          29027          29102          75          0.0       29026.6       2.0X

OpenJDK 64-Bit Server VM 21.0.6+7-LTS on Linux 6.8.0-1020-azure
AMD EPYC 7763 64-Core Processor
Count a dataset with 10 columns:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns + count()                       10832          10860          39          0.9        1083.2       1.0X
Select 1 column + count()                          7372           7399          27          1.4         737.2       1.5X
count()                                            1698           1706           8          5.9         169.8       6.4X

OpenJDK 64-Bit Server VM 21.0.6+7-LTS on Linux 6.8.0-1020-azure
AMD EPYC 7763 64-Core Processor
Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps                      864            867           2         11.6          86.4       1.0X
to_csv(timestamp)                                  6183           6192          11          1.6         618.3       0.1X
write timestamps to files                          6506           6512           7          1.5         650.6       0.1X
Create a dataset of dates                           961            962           2         10.4          96.1       0.9X
to_csv(date)                                       4597           4600           5          2.2         459.7       0.2X
write dates to files                               4608           4613           6          2.2         460.8       0.2X

OpenJDK 64-Bit Server VM 21.0.6+7-LTS on Linux 6.8.0-1020-azure
AMD EPYC 7763 64-Core Processor
Read dates and timestamps:                                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------
read timestamp text from files                                                  1311           1314           5          7.6         131.1       1.0X
read timestamps from files                                                     11583          11590           8          0.9        1158.3       0.1X
infer timestamps from files                                                    22995          23055          64          0.4        2299.5       0.1X
read date text from files                                                       1234           1276          37          8.1         123.4       1.1X
read date from files                                                           11216          11238          30          0.9        1121.6       0.1X
infer date from files                                                          22681          22718          35          0.4        2268.1       0.1X
timestamp strings                                                               1224           1227           2          8.2         122.4       1.1X
parse timestamps from Dataset[String]                                          13706          13760          83          0.7        1370.6       0.1X
infer timestamps from Dataset[String]                                          25170          25224          64          0.4        2517.0       0.1X
date strings                                                                    1698           1704           5          5.9         169.8       0.8X
parse dates from Dataset[String]                                               12766          12789          21          0.8        1276.6       0.1X
from_csv(timestamp)                                                            11607          11690          73          0.9        1160.7       0.1X
from_csv(date)                                                                 11353          11364          13          0.9        1135.3       0.1X
infer error timestamps from Dataset[String] with default format                14883          14927          46          0.7        1488.3       0.1X
infer error timestamps from Dataset[String] with user-provided format          14897          14928          38          0.7        1489.7       0.1X
infer error timestamps from Dataset[String] with legacy format                 14893          14931          45          0.7        1489.3       0.1X

OpenJDK 64-Bit Server VM 21.0.6+7-LTS on Linux 6.8.0-1020-azure
AMD EPYC 7763 64-Core Processor
Filters pushdown:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
w/o filters                                        4227           4239          14          0.0       42274.8       1.0X
pushdown disabled                                  4259           4299          42          0.0       42592.2       1.0X
w/ filters                                          741            746           4          0.1        7414.8       5.7X

OpenJDK 64-Bit Server VM 21.0.6+7-LTS on Linux 6.8.0-1020-azure
AMD EPYC 7763 64-Core Processor
Interval:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Read as Intervals                                   808            809           1          0.4        2693.2       1.0X
Read Raw Strings                                    325            330           6          0.9        1082.6       2.5X


