Patrick's CMS

Unix and Windows lines

Websites served by Apache on Linux run on a Unix-like system. Websites on IIS servers run on Windows, not Unix.

A line break (newline) in Unix is "\n" (a line feed) but in Windows is "\r\n" (a carriage return plus a line feed). This means files transferred between the two may have different line endings.
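To see the two conventions byte by byte, here is a small Python sketch. The sample text is made up for illustration.

```python
# The same two-line text written with each convention (illustrative sample).
unix_text = "first line\nsecond line\n"
windows_text = "first line\r\nsecond line\r\n"

# Encode to bytes to see exactly what would be stored in a file.
unix_bytes = unix_text.encode("ascii")
windows_bytes = windows_text.encode("ascii")

print(unix_bytes)     # each line ends with a line feed only (byte 0x0A)
print(windows_bytes)  # each line ends with carriage return + line feed (0x0D 0x0A)

# The Windows version carries one extra byte per line.
extra_bytes = len(windows_bytes) - len(unix_bytes)
print(extra_bytes)
```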

For instance, when a two-paragraph text file on a Unix website is downloaded to a Windows PC, opened, then pasted back into a textarea on a Unix website, the original single empty line separating the paragraphs becomes three. When the file was opened in Windows a carriage return was automatically added ("\n" became "\r\n") to:

(1) the line ending at the end of the upper paragraph, and

(2) the line ending at the end of the empty line

... which makes one empty line into three. This happens only if the file is actually opened in Windows. If the file is opened with Notepad the extra lines are already apparent. They may not show in every text editor, but they do appear when the text is pasted back into Unix.
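The effect can be reproduced in a few lines of Python. This is only a sketch: it assumes the receiving side treats the carriage return and the line feed as two separate line breaks, which is a guess at the mechanism rather than something confirmed here. The function names are hypothetical.

```python
import re

def open_in_windows(text):
    # Hypothetical step: every "\n" becomes "\r\n".
    return text.replace("\n", "\r\n")

def split_each_break(text):
    # Assumption: "\r" and "\n" each count as a line break of their own.
    return re.split(r"\r|\n", text)

original = "Upper paragraph.\n\nLower paragraph."
pasted = open_in_windows(original)

lines_before = original.split("\n")
lines_after = split_each_break(pasted)

# Count the empty lines between the two paragraphs, before and after.
empty_before = lines_before.count("")
empty_after = lines_after.count("")
print(empty_before, empty_after)
```

Under that assumption the single empty line in the original is counted as three after the round trip.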

None of this happens when the original file was created on an IIS (Windows) website, or even in Notepad in a Windows environment. Such a file can be pasted into Unix with no ill effect. The problem seems to occur only when the original file was created in Unix with "\n" line endings and is then opened in Windows. Repeat: no problem creating a file in Windows then pasting it into Unix.

The autop function standardises line endings as "\n" on all platforms.
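A minimal sketch of that kind of normalisation, written in Python for illustration. It is not the code of the autop function itself, and the name normalise_line_endings is hypothetical.

```python
def normalise_line_endings(text):
    # Convert Windows "\r\n" first so it does not end up as two breaks,
    # then convert any remaining lone "\r".
    return text.replace("\r\n", "\n").replace("\r", "\n")

windows_text = "Upper paragraph.\r\n\r\nLower paragraph."
normalised = normalise_line_endings(windows_text)
print(repr(normalised))
```

The order of the two replacements matters: handling a lone "\r" first would turn every "\r\n" into two line feeds.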


Files can be transferred between an FTP client and server in ASCII or binary type. The purpose of ASCII type is to ensure that line endings are properly changed to what is right on the receiving platform. According to the FTP specification, ASCII files are always transferred using a CR+LF pair as line ending (i.e. the Windows convention). When a file is transferred from the client to the server, the client has to make sure CR+LF is used. Therefore it has to add nothing (on Microsoft Windows) or add a CR before each LF (on Unix). The server then adjusts the line ending again to what is used on the platform the server runs on. If it is Microsoft Windows, nothing has to be removed, while on Unix the superfluous CR is removed.
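The upload described above can be imitated in Python. This is a simplified sketch, not how any particular FTP program is written: the names to_wire and from_wire are hypothetical, and it assumes the file contains only its own platform's line ending.

```python
CRLF = b"\r\n"
LF = b"\n"

def to_wire(data, local_eol):
    # Sender: replace the local line ending with CR+LF for the transfer.
    return data.replace(local_eol, CRLF)

def from_wire(data, local_eol):
    # Receiver: replace CR+LF with the line ending of its own platform.
    return data.replace(CRLF, local_eol)

# A Unix client uploads a small text file in ASCII type.
local_file = b"one\ntwo\n"
on_the_wire = to_wire(local_file, LF)

# What ends up stored depends on the server's platform.
stored_on_windows = from_wire(on_the_wire, CRLF)
stored_on_unix = from_wire(on_the_wire, LF)

print(on_the_wire)
print(stored_on_windows)
print(stored_on_unix)
```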

The same happens when a file is downloaded from the server to the client. The server makes sure the line endings are CR+LF when sending the file and the client then strips away whatever is not needed as line ending on its platform. Because the file is changed if client and server are not running on the same kind of platform, this data type cannot be used for files with arbitrary characters, so-called binary files like images and videos. If it is used anyway, the binary files will most likely be corrupted and won't work as expected any more.
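A short Python sketch of why ASCII type damages binary files. The eight bytes used are the standard signature at the start of every PNG image, which happens to contain a CR+LF pair. The function name is hypothetical and stands in for a Unix client receiving in ASCII type.

```python
# First eight bytes of every PNG file (the PNG signature).
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

def receive_as_ascii_on_unix(data):
    # Hypothetical Unix client in ASCII type: CR+LF becomes LF.
    return data.replace(b"\r\n", b"\n")

received = receive_as_ascii_on_unix(png_signature)

# One byte has been removed, so the file no longer starts with a valid signature.
print(len(png_signature), len(received))
is_intact = received == png_signature
print(is_intact)
```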

Compared to ASCII type, binary type is the easier one: the file is just transferred as-is, and no line ending translation is done. So when you are not sure what to use, always go for binary type.

CR+LF = \r\n
LF = \n

Note: I am not an expert in such matters.


See FileZilla data types »

Information

Page last modified: 30 November, 2022