1. Fast data writing with fwrite()
In this final lesson of the course, you will look at data.table's high-performance parallel file writer, fwrite().
2. fwrite
Similar to fread(), fwrite() is a fast, parallel file writer. By default, it writes with the number of threads reported by getDTthreads(). You can control the number of threads with the "nThread" argument.
fwrite() provides intelligent defaults so that, in most cases, only the data and file name arguments are required. It can also write columns of type "list" by flattening each list element with a secondary separator (the "sep2" argument), which defaults to the pipe symbol ("|").
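As a minimal sketch of these defaults, the snippet below writes a small table with a list column and then prints the raw lines of the file. The table contents, the temporary file, and the choice of two threads are made up for illustration.

```r
library(data.table)

# Hypothetical example table with a list column
dt <- data.table(
  id   = 1:3,
  tags = list(c("a", "b"), "c", c("d", "e", "f"))
)

# Temporary file used only for this illustration
out_file <- tempfile(fileext = ".csv")

# Only the data and the file name are required; nThread is optional
fwrite(dt, out_file, nThread = 2)

# Inspect the raw text: list elements are joined with "|"
lines <- readLines(out_file)
print(lines)
```

To use a different secondary separator, you would pass a three-element "sep2" vector, for example sep2 = c("", ";", "").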
3. date and datetime columns (ISO)
fwrite() also provides multiple ways to write date and datetime columns through the "dateTimeAs" argument, which defaults to "ISO". This writes datetime values in the ISO 8601 international standard format, which avoids ambiguity both when writing to a file and when reading it back.
4. Date and times
Let's use this data.table, dt, to look at these formats. Note that each column is a different type: date, time, and datetime. The as.IDate() and as.ITime() functions from data.table extract the relevant portions from the timestamp.
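A table like the one described could be built as follows. This is only a sketch: the timestamp is a made-up value, and its time zone is fixed to UTC so that the result does not depend on your local settings.

```r
library(data.table)

# Hypothetical timestamp, pinned to UTC for reproducibility
now <- as.POSIXct("2018-12-17 20:07:23", tz = "UTC")

dt <- data.table(
  date     = as.IDate(now),  # date portion only
  time     = as.ITime(now),  # time-of-day portion only
  datetime = now             # full timestamp
)

# Show the first class of each column
col_classes <- sapply(dt, function(col) class(col)[1])
print(col_classes)
```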
5. date and datetime columns (ISO)
Have a look at the "datetime" column that is being read back using fread(). This is exactly the way fwrite() wrote the column to file. Also note that the ISO format not only avoids ambiguity but is also extremely fast to write.
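To see what the ISO format looks like on disk, you can write a small table and print the raw lines before reading the file back. The sketch below rebuilds a one-row table from a made-up UTC timestamp and uses a temporary file.

```r
library(data.table)

# Hypothetical one-row table with date, time and datetime columns
now <- as.POSIXct("2018-12-17 20:07:23", tz = "UTC")
dt <- data.table(date = as.IDate(now), time = as.ITime(now), datetime = now)

iso_file <- tempfile(fileext = ".csv")

# dateTimeAs = "ISO" is the default, so it does not need to be spelled out
fwrite(dt, iso_file)

# Raw text as written by fwrite()
iso_lines <- readLines(iso_file)
print(iso_lines)

# Read the file back with fread()
iso_back <- fread(iso_file)
print(iso_back)
```

How fread() types the columns it reads back can differ between data.table versions, so it is worth checking the classes of the result in your own session.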
6. date and datetime columns (Squash)
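You can also set the dateTimeAs argument to "squash" which, as the name suggests, squashes the values together by removing the hyphen (-) and colon (:) separators, along with the "T" and "Z" markers in datetimes. As a result, date and time columns are read back by default as integers, while a squashed datetime is 17 digits long (it always includes three digits of milliseconds) and is read back as a 64-bit integer.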
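Here is a small sketch of the "squash" option, again using a made-up UTC timestamp and a temporary file. It prints the raw text so you can see the squashed values before fread() interprets them.

```r
library(data.table)

# Hypothetical one-row table with date, time and datetime columns
now <- as.POSIXct("2018-12-17 20:07:23", tz = "UTC")
dt <- data.table(date = as.IDate(now), time = as.ITime(now), datetime = now)

squash_file <- tempfile(fileext = ".csv")
fwrite(dt, squash_file, dateTimeAs = "squash")

# Raw text: separators are gone, and the datetime carries milliseconds
squash_lines <- readLines(squash_file)
print(squash_lines)

# Read only the date and time columns back; both arrive as plain integers
squash_back <- fread(squash_file, select = c("date", "time"))
print(sapply(squash_back, class))
```

The datetime column is left out of the read step here only to keep the sketch independent of the bit64 package, which is what provides the integer64 class.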
7. date and datetime columns (Squash)
This is particularly useful for columns whose primary purpose is to allow extraction of the year, month, day, hour, minute, or second, which is quite handy in grouping operations.
For example, using integer division as shown here, you can extract the year very efficiently.
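The idea can be sketched as follows, assuming a column of squashed dates stored as integers in yyyymmdd form. The table and its values are invented for illustration.

```r
library(data.table)

# Hypothetical table with squashed (yyyymmdd) integer dates
sales <- data.table(
  date  = c(20171231L, 20180101L, 20181217L),
  value = c(10, 20, 30)
)

# Integer division and modulo pull out the individual parts
sales[, year  := date %/% 10000L]            # drop the last four digits
sales[, month := (date %/% 100L) %% 100L]    # middle two digits
sales[, day   := date %% 100L]               # last two digits

# Group directly on the extracted year
by_year <- sales[, .(total = sum(value)), by = year]
print(by_year)
```

Because these are plain integer operations, no date parsing is needed at grouping time.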
8. date and datetime columns (Epoch)
You can also set the dateTimeAs argument to "epoch", which writes the number of days (for dates) or seconds (for times and datetimes) since the relevant epoch. The epoch is Jan 1, 1970 for dates, midnight for times, and Jan 1, 1970 at midnight for datetimes. Values earlier than their respective epoch are written as negative numbers.
The options "ISO", "squash" and "epoch" are all extremely fast thanks to specialized C code, and they are extensively tested for correctness. They also allow those columns to be read back quickly and unambiguously.
9. date and datetime columns (Epoch)
As you can see here, when you write to a file with dateTimeAs set to "epoch" and read it back, the columns are integers denoting the number of days (for the date) and seconds (for the time and datetime) from the respective epoch.
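A round trip with the "epoch" option might look like the sketch below, which uses a made-up UTC timestamp and a temporary file. The conversions back to date and datetime at the end use base R and assume the standard 1970-01-01 origin.

```r
library(data.table)

# Hypothetical one-row table with date, time and datetime columns
now <- as.POSIXct("2018-12-17 20:07:23", tz = "UTC")
dt <- data.table(date = as.IDate(now), time = as.ITime(now), datetime = now)

epoch_file <- tempfile(fileext = ".csv")
fwrite(dt, epoch_file, dateTimeAs = "epoch")

# Raw text: days for the date, seconds for the time and datetime
epoch_lines <- readLines(epoch_file)
print(epoch_lines)

# Read back: every column is a plain number
epoch_back <- fread(epoch_file)
print(epoch_back)

# Convert back to date and datetime classes when needed
restored_date     <- as.Date(epoch_back$date, origin = "1970-01-01")
restored_datetime <- as.POSIXct(epoch_back$datetime, origin = "1970-01-01", tz = "UTC")
print(restored_date)
print(restored_datetime)
```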
10. Let's practice!
Go ahead and use fwrite() to write data to files.