Wednesday, July 21, 2010

What's the difference between text and binary I/O?

Q: What's the difference between text and binary I/O?

A: In text mode, a file is assumed to consist of lines of printable characters (perhaps including tabs). The routines in the stdio library (getc, putc, and all the rest) translate between the underlying system's end-of-line representation and the single \n used in C programs. C programs which simply read and write text therefore don't have to worry about the underlying system's newline conventions: when a C program writes a '\n', the stdio library writes the appropriate end-of-line indication, and when the stdio library detects an end-of-line while reading, it returns a single '\n' to the calling program. [footnote]

In binary mode, on the other hand, bytes are read and written between the program and the file without any interpretation. (On MS-DOS systems, binary mode also turns off testing for control-Z as an in-band end-of-file character.)

Text mode translations also affect the apparent size of a file as it's read. Because the characters read from and written to a file in text mode do not necessarily match the characters stored in the file exactly, the size of the file on disk may not always match the number of characters which can be read from it. Furthermore, for analogous reasons, the fseek and ftell functions do not necessarily deal in pure byte offsets from the beginning of the file. (Strictly speaking, in text mode, the offset values used by fseek and ftell should not be interpreted at all: a value returned by ftell should only be used as a later argument to fseek, and only values returned by ftell should be used as arguments to fseek.)

In binary mode, fseek and ftell do use pure byte offsets. However, some systems may have to append a number of null bytes at the end of a binary file to pad it out to a full record.

See also questions 12.37 and 19.12.

參考來源:

"Q: What's the difference between text and binary I/O?"
- Question 12.40 (在「Google 網頁註解」中檢視)

No comments: