Library for I/O

This chapter forms an integral part of "The Input/Output Module". If "The Input/Output Module" is implemented, this chapter and every other chapter constituting part of "The Input/Output Module" must be implemented in their entirety.

For the purpose of this chapter, the following definitions from the POSIX standard apply:

Additionally, a file handle is anything that can be used to operate on files. One file may have several file handles. This chapter defines several types of objects that are file handles.

When a file is operated on from separate handles, the behavior is undefined.

Note: For example, in C, when standard input is read through a FILE * handle with buffering enabled, the subsequent file position of the underlying file descriptor (if the stream is implemented on top of one) is undefined. This can cause issues when one program subsequently loads another (e.g. using one of the exec functions) and the loaded program proceeds from an unexpected file position.
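The effect described in the note above can be observed with a small C sketch. It assumes a POSIX system with a writable /tmp; the function name fd_offset_after_one_byte is hypothetical and not part of cxing. The program consumes one byte through a buffered stream and then asks the descriptor where it is. The exact offset depends on the C library's buffer size, which is why no particular value is promised here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Writes `len` bytes to a scratch file, reads ONE byte back through a
 * buffered FILE *, and returns the offset of the underlying descriptor.
 * Returns -1 on any runtime failure. */
long fd_offset_after_one_byte(size_t len)
{
    char path[] = "/tmp/cxing_demo_XXXXXX";
    char data[1024];
    long result = -1;

    assert(len > 0 && len <= sizeof data);
    memset(data, 'x', len);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    FILE *fp = fopen(path, "rb");
    if (fp != NULL) {
        if (fgetc(fp) != EOF) {
            /* Ask the descriptor directly, bypassing the stdio buffer. */
            result = (long)lseek(fileno(fp), 0, SEEK_CUR);
        }
        fclose(fp);
    }
    unlink(path);
    return result;
}
```

Although the program consumed a single byte, the descriptor has usually advanced by a whole buffer's worth of data, which is what a subsequently loaded program would inherit.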

Note: This is among the few undefined behaviors in cxing. By choosing not to define this behavior, explicit permission is given to implementations to buffer I/O data.

When a directory entry is created as a result of calling one of the functions that access the filesystem, the called function should not impose access restrictions beyond those already imposed by system defaults. This holds even though it is not a recommended practice, and it does not preclude security hardening by specific implementations of this module.

Note: As an example of what the previous paragraph means, function calls such as mkdir, mkfifo, open, etc. should use the most liberal permissions on the created file - i.e. 0o777 for directories and 0o666 for non-executable files according to POSIX - with the 'file mode creation mask' (i.e. umask) clearing excess permissions as the said 'system default'. The previous paragraph is normative only to the extent that it does not forbid current and evolving security best practice.
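The interaction between a liberal requested mode and the file mode creation mask can be sketched in C as follows. This assumes a POSIX system with a writable /tmp and no default ACLs on it; dir_mode_under_umask is a hypothetical helper, not something this chapter defines.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a directory with the most liberal mode (0777) while `mask` is the
 * file mode creation mask, and returns the permission bits the system
 * actually assigned, or -1 on a runtime failure. */
int dir_mode_under_umask(mode_t mask)
{
    char base[] = "/tmp/cxing_demo_XXXXXX";
    char path[64];
    struct stat st;
    int result = -1;

    assert((mask & ~(mode_t)0777) == 0);
    if (mkdtemp(base) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/sub", base);

    mode_t saved = umask(mask);          /* install the "system default" */
    if (mkdir(path, 0777) == 0) {        /* request every permission bit */
        if (stat(path, &st) == 0)
            result = (int)(st.st_mode & 0777); /* what the mask left over */
        rmdir(path);
    }
    umask(saved);                        /* restore the caller's mask */
    rmdir(base);
    return result;
}
```

The library function itself adds no restriction; whatever tightening happens comes from the mask chosen by the user or administrator.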

Simple Input/Output

subr input();
subr print(s);

The input() function is a subroutine that reads a line from the standard input, strips a single trailing line-feed \n byte and then, if there is one, a trailing carriage-return \r byte, and returns the resulting string. On EOF, a blessed null that uncasts to 0 is returned; on error, a blessed null that uncasts to an implementation-defined status code is returned.
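The stripping rule for input() might be realized along the following lines. This is a C sketch under one assumption that the text does not settle: a trailing \r is removed even when the line was not terminated by \n (for example, at end of file). The name strip_line_ending is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Strips one trailing '\n' and then, if present, one trailing '\r' from the
 * first `len` bytes of `line`, in place, and NUL-terminates the result.
 * `line` must have room for at least len + 1 bytes.
 * Returns the new length. */
size_t strip_line_ending(char *line, size_t len)
{
    assert(line != NULL);
    if (len > 0 && line[len - 1] == '\n')
        len--;                  /* at most one line feed is removed */
    if (len > 0 && line[len - 1] == '\r')
        len--;                  /* then at most one carriage return */
    line[len] = '\0';
    return len;
}
```

Only one terminator of each kind is removed, so an empty line followed by another line feed keeps its inner \n.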

The print() function is a subroutine that writes the string argument s to the standard output, followed by a single line-feed \n byte. On success, it returns the number of bytes successfully written. On failure, it returns a blessed null that uncasts to an implementation-defined status code.

Generic File

GenericFile(obj) := {
  method read(len),
  method getdelim(c),
  method getline(),
  method write(s),
  method __copy__(),
  method __final__(),
  method flush(),
  method setsync(b),
}

A GenericFile is the base type for file handle objects.

Its read method reads at most len bytes of data and returns it. On EOF, it returns an empty string; on error, it returns a blessed null that uncasts to an implementation-defined status code.

Its getdelim method reads until the delimiter byte c is encountered, and returns a string up to and including the delimiter. If c is a string, its initial byte is taken; if c is an empty string, the nul byte is assumed; if c is an integer, its lower 8 bits are taken as the value of the delimiter byte. Any data that has been read but not returned shall remain available for further reading from the same file handle object.
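The three rules for deriving the delimiter byte can be written down compactly. The following C sketch uses two hypothetical helpers, delim_from_string and delim_from_int, to stand for the string and integer cases; how an implementation actually dispatches on the type of c is not specified here.

```c
#include <assert.h>
#include <stddef.h>

/* Delimiter byte for a string argument of `len` bytes:
 * its first byte, or the nul byte when the string is empty. */
unsigned char delim_from_string(const char *s, size_t len)
{
    assert(s != NULL || len == 0);
    return len > 0 ? (unsigned char)s[0] : 0;
}

/* Delimiter byte for an integer argument: its lower 8 bits. */
unsigned char delim_from_int(long long c)
{
    return (unsigned char)((unsigned long long)c & 0xFF);
}
```

Under this reading, passing the integer 0x10A selects the same delimiter as passing the string "\n".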

Its getline method is equivalent to getdelim called with \n - i.e. the U+000A LINE FEED character.

Its write method writes the string s to the file, and returns the number of bytes actually written. On error, it returns a blessed null that uncasts to an implementation-defined status code.

The closure of the file is governed by the __copy__ and the __final__ methods for resource management. Each copy of a file handle produced by the __copy__ method refers to the same underlying file. When all copies of the file handle are destroyed, the file handle is automatically closed: any buffered content is committed, and any resource used for operating on the file is released.
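One plausible way to obtain this behavior is reference counting. The C sketch below is an assumption about how an implementation could do it, not a description of any actual cxing runtime; struct shared_file and the handle_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical state shared by every copy of one file handle. */
struct shared_file {
    FILE *fp;       /* the underlying open file */
    unsigned refs;  /* number of live copies */
};

/* Wraps an open stream in a handle with a single owner. */
struct shared_file *handle_open(FILE *fp)
{
    assert(fp != NULL);
    struct shared_file *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    h->fp = fp;
    h->refs = 1;
    return h;
}

/* Plays the role of __copy__: the copy refers to the same file. */
struct shared_file *handle_copy(struct shared_file *h)
{
    assert(h != NULL && h->refs > 0);
    h->refs++;
    return h;
}

/* Plays the role of __final__: returns 1 if this call closed the file,
 * 0 if other copies are still alive. */
int handle_final(struct shared_file *h)
{
    assert(h != NULL && h->refs > 0);
    if (--h->refs > 0)
        return 0;
    fclose(h->fp);  /* commits buffered output and releases the stream */
    free(h);
    return 1;
}
```

Only the destruction of the last copy closes the file, so earlier copies can be dropped in any order without affecting the remaining ones.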

For any file, there may be several layers of buffering, two of which are defined here (the rest are merely acknowledged).

  1. The user-space buffering, which is committed by calling the flush method,
  2. The system buffering, which can be disabled (or enabled) by calling setsync with true (or false).

The act of "committing" makes it more likely that future access to the data will succeed, for example by writing the data permanently to the disk. Further buffering, such as that done by routers and switches for network sockets, is out of the control of the program and, to some extent, of the system.
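On a POSIX system the two defined layers correspond roughly to the stdio buffer and the kernel cache. The C sketch below commits through both in order; commit_all is a hypothetical name, and treating fsync as the counterpart of the system layer is an assumption (an implementation of setsync might instead open the file with O_SYNC or synchronize after every write).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Pushes pending output of `fp` through both buffering layers.
 * Returns 0 on success, -1 on failure (errno describes the error). */
int commit_all(FILE *fp)
{
    assert(fp != NULL);
    if (fflush(fp) != 0)         /* layer 1: user-space buffer to kernel */
        return -1;
    if (fsync(fileno(fp)) != 0)  /* layer 2: kernel cache to the device */
        return -1;
    return 0;
}
```

The order matters: synchronizing the descriptor before flushing the stream would leave the user-space data uncommitted.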

Regular Files

subr open(path, mode);
RegularFile(GenericFile) := {
  method lseek(offset, whence),
}

The open function is a subroutine that opens the file named by the path argument, under the mode specified by the mode argument. The file to open does not have to be a regular file; any type of file supported by the implementation may be opened (e.g. FIFOs, but not sockets).

The mode is made up of one of the following 4 major options:

and modified by any combination of the following minor options:

The lseek method adds offset to the position indicated by whence, and returns the resulting file position:

Unidirectional Communication

The types of files in this section are required to support communication in one direction; support for bidirectional communication is optional and not required.

subr mkfifo(path);
subr pipe();

The mkfifo function creates a FIFO - i.e. a pipe with a filesystem name. On success, it returns path; on failure, it returns a blessed null that uncasts to an implementation-defined status code.

The pipe function creates an anonymous pipe, and returns an object with 2 members:

Both members are file handles. On failure, it returns a blessed null that uncasts to an implementation-defined status code.
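For comparison, the POSIX pipe() call yields two descriptors, one for reading and one for writing. The C sketch below sends a short message through an anonymous pipe within a single process; pipe_round_trip and MAX_MSG are hypothetical names, and the message is capped so that the write cannot block on a full pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Smallest PIPE_BUF that POSIX allows; writes this small never block
 * on an empty pipe. */
#define MAX_MSG 512

/* Sends `msg` through a fresh anonymous pipe and reads it back into `out`
 * (capacity `cap`). Returns the number of bytes received, or -1. */
long pipe_round_trip(const char *msg, char *out, size_t cap)
{
    int fds[2];  /* fds[0] is the read end, fds[1] is the write end */

    assert(msg != NULL && out != NULL);
    size_t len = strlen(msg);
    if (len == 0 || len > MAX_MSG || len > cap)
        return -1;
    if (pipe(fds) != 0)
        return -1;

    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);  /* the reader sees EOF once the data is consumed */

    ssize_t got = -1;
    if (written == (ssize_t)len)
        got = read(fds[0], out, cap);
    close(fds[0]);
    return (long)got;
}
```

Data flows only from the write end to the read end, which matches the unidirectional requirement of this section.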

Issue: default handling of SIGPIPE.

Filesystem Operations

subr rename(old, new);
subr remove(path);

The function rename renames the old directory entry to the new name. On success, new is returned; otherwise, a blessed null that uncasts to an implementation-defined status code is returned.

The function remove causes the directory entry path to be no longer accessible. On success, it returns 0; otherwise, a blessed null that uncasts to an implementation-defined status code is returned.
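The observable effect of the two operations can be checked with the C functions of the same names. The sketch below assumes a POSIX system with a writable /tmp; rename_then_remove is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates a scratch file, renames it by appending `suffix`, checks that
 * only the new name resolves, then removes it and checks that it is gone.
 * Returns 0 if every step behaved as expected, -1 otherwise. */
int rename_then_remove(const char *suffix)
{
    char old_path[] = "/tmp/cxing_demo_XXXXXX";
    char new_path[64];

    assert(suffix != NULL && suffix[0] != '\0');
    int fd = mkstemp(old_path);
    if (fd < 0)
        return -1;
    close(fd);

    int n = snprintf(new_path, sizeof new_path, "%s%s", old_path, suffix);
    if (n < 0 || (size_t)n >= sizeof new_path ||
        rename(old_path, new_path) != 0) {
        remove(old_path);
        return -1;
    }

    /* After rename: the old entry is gone, the new one exists. */
    int ok = access(old_path, F_OK) != 0 && access(new_path, F_OK) == 0;

    if (remove(new_path) != 0)
        return -1;
    /* After remove: the entry is no longer accessible. */
    ok = ok && access(new_path, F_OK) != 0;
    return ok ? 0 : -1;
}
```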

subr mkdir(path);
[subr opendir(path)] := {
  method readdir(),
  method rewinddir(),
  method closedir(),
}

The mkdir function creates a directory reachable at path. On success, path is returned; otherwise, a blessed null that uncasts to an implementation-defined status code is returned.

The opendir function opens a directory to enumerate its entries. On success, a directory handle is returned; otherwise, a blessed null that uncasts to an implementation-defined status code is returned.

The readdir method returns a string naming the directory entry at the current directory position, and advances that position. The directory position is an opaque internal property of the directory handle. The rewinddir method resets the directory position to the state it was in when the directory was opened, before any call to readdir was made.

The closedir method releases any resource used by the directory handle. Any further use of the directory handle is invalid and results in an error in an undefined way.
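The life cycle of a directory handle mirrors the POSIX functions of the same names. The C sketch below enumerates a scratch directory, confirms that the position is exhausted, rewinds, and enumerates again. It assumes a POSIX system with a writable /tmp; enumerate_twice and the static helpers are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Counts entries other than "." and ".." from the current directory
 * position to the end. */
static long count_remaining(DIR *dir)
{
    long n = 0;
    struct dirent *e;

    assert(dir != NULL);
    while ((e = readdir(dir)) != NULL) {
        if (strcmp(e->d_name, ".") != 0 && strcmp(e->d_name, "..") != 0)
            n++;
    }
    return n;
}

/* Creates (make != 0) or deletes (make == 0) the file `name` in `dir`. */
static int touch_or_delete(const char *dir, const char *name, int make)
{
    char path[128];
    snprintf(path, sizeof path, "%s/%s", dir, name);
    if (!make)
        return remove(path);
    FILE *fp = fopen(path, "w");
    return fp == NULL ? -1 : fclose(fp);
}

/* Enumerates a scratch directory holding two files twice through a single
 * handle. Returns the entry count if both passes agree, -1 otherwise. */
long enumerate_twice(void)
{
    char base[] = "/tmp/cxing_demo_XXXXXX";
    long first = -1, at_end = -1, second = -1;

    if (mkdtemp(base) == NULL)
        return -1;
    if (touch_or_delete(base, "a", 1) == 0 &&
        touch_or_delete(base, "b", 1) == 0) {
        DIR *dir = opendir(base);
        if (dir != NULL) {
            first = count_remaining(dir);   /* first full pass */
            at_end = count_remaining(dir);  /* position is exhausted */
            rewinddir(dir);                 /* back to the initial state */
            second = count_remaining(dir);  /* second full pass */
            closedir(dir);                  /* handle must not be reused */
        }
    }
    touch_or_delete(base, "a", 0);
    touch_or_delete(base, "b", 0);
    rmdir(base);
    return (at_end == 0 && first == second) ? first : -1;
}
```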

Error Numbers

There are numerous references to the 'implementation-defined status code' that is returned (in the form of a blessed null) on failure. On POSIX and some other platforms, this is typically one of the errno codes.

Whenever this module is supported, all error number codes (the errno constants defined in the relevant header of POSIX-2017) shall be defined and exposed as integer constants to the cxing program. Implementations may expose additional errno constants beyond those specified in POSIX-2017. Future versions of this module may require definitions of errno constants from a newer edition of the POSIX standard.
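On a POSIX platform, the status code carried by a blessed null would typically be the errno value left behind by the failing system call. The C sketch below shows where such a value comes from; open_status is a hypothetical helper, and the path used in the usage note is assumed not to exist.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Tries to open `path` read-only. Returns 0 on success, otherwise the
 * errno value describing the failure. */
int open_status(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;  /* e.g. ENOENT when the path does not exist */
    close(fd);
    return 0;
}
```

A program can then compare the result against the exposed constants, for instance checking open_status("/nonexistent/cxing_demo") against ENOENT.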