Summary | error :source-ing files with "variable length" record format |
Queue | Vim on VMS |
Type | Bug |
State | Accepted |
Priority | 1. Low |
Owners | |
Requester | sferencik (at) alpha (dot) polarhome (dot) com |
Created | 12/28/2010 (5287 days ago) |
Due | |
Updated | 03/29/2011 (5196 days ago) |
Assigned | |
Resolved | |
Attachments |
= ONCE MORE (more readable indent) =
= ignore Comment #2 =
====================================
Motivation for this ticket.
The problem with sourcing appears whenever a vim plugin is copied to
VMS via FTP. If it is dropped into [plugin] or [ftplugin] or
[autoload] it then crashes when it is sourced.
--------------------
Details
Here are some notes I have compiled on copying files from Windows to
VMS via FTP.
Binary FTP
First of all, binary FTP copying is out of the question. The file
created on VMS is padded to the next higher multiple of 512
bytes, which is very ugly. The padding also shows when the file is
opened in vim. (Other than that, a file copied this way is nicely
readable on VMS by vim, and sourcing it works -- if its
fileformat on Windows was "unix".) The rest of this note therefore
discusses ASCII FTP.
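As a rough illustration of the padding described above, this Python sketch computes the size a binary-FTP-copied file would end up with. It assumes the size is simply rounded up to whole 512-byte blocks and that a file already at an exact multiple is not padded further. The name padded_size is made up for this note.

```python
BLOCK_SIZE = 512  # block size observed in the note above


def padded_size(original_size: int) -> int:
    """Round a byte count up to the next multiple of BLOCK_SIZE."""
    blocks = -(-original_size // BLOCK_SIZE)  # ceiling division
    return blocks * BLOCK_SIZE


print(padded_size(1000))  # 1024
```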
ASCII FTP
A file on the local (Windows) machine has either the DOS or the UNIX
fileformat. In the former case, the lines are separated by
0d0a, in the latter case only by 0a. Obviously, the former is
native to Windows, the latter is alien to it.
When the file is copied to VMS via ASCII FTP, it is converted to an
RMS file with variable-length record format and
carriage-return-carriage-control record attributes. FTP knows
it is copying a file from Windows to VMS, and thus presumes
that the original (Windows) file uses 0d0a sequences as line
separators. (This assumption is wrong if the fileformat is unix.)
Whatever the fileformat was on Windows, the resultant file has
now been converted to an RMS file, in which each line (record) has
an attribute specifying its length. The record attributes are
"Carriage return carriage control". This would suggest there are
still some "carriage returns" between records (although xxd reports
line feeds, "0a"s), but this may just be a trick of OpenVMS:
Wikipedia: "OpenVMS uses a record-based file system, which
stores text files as one record per line. In most file formats,
no line terminators are actually stored, but the Record
Management Services facility can transparently add a terminator to
each line when it is retrieved by an application."
This is where the original fileformat makes a difference: a
multi-line file with fileformat=dos has been converted to a
multi-record RMS file, with records starting wherever 0d0a
sequences were found. A multi-line file with fileformat=unix was also
converted, but since there were no 0d0a sequences, it only
contains a single record on VMS. xxd sees them identically: for the
multi-record "dos" file it displays "0a"s where records break;
for the single-record "unix" file it displays "0a"s which have
not been recognised as line terminators and are "still present"
in the single record.
Example: let's suppose a Windows file contains the text "a\nb".
The "dos" representation would be 61-0d-0a-62, and this would be
converted to a 2-record file: the first record would have length
1 (the byte 61), the second record would also have length 1
(the byte 62), and xxd would show 61-0a-62. The "unix"
representation would be 61-0a-62, and this would be converted to a
single-record file: the record would have length 3 (61-0a-62),
and xxd would also show 61-0a-62.
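The example above can be replayed with a small Python model. This is only a sketch of the behaviour as described in this note, not of the real FTP server or RMS. It assumes records are cut at 0d0a sequences and that a reader such as xxd sees a 0a between records. Whether a trailing terminator is also added is not modelled. The function names are invented for illustration.

```python
def ascii_ftp_to_records(data: bytes) -> list:
    """Model of the ASCII FTP transfer: a new record starts at every 0d0a."""
    return data.split(b"\r\n")


def xxd_view(records: list) -> bytes:
    """Model of what a byte-level reader sees: records separated by 0a."""
    return b"\n".join(records)


dos_file = b"a\r\nb"   # 61-0d-0a-62
unix_file = b"a\nb"    # 61-0a-62

dos_records = ascii_ftp_to_records(dos_file)    # two records of length 1
unix_records = ascii_ftp_to_records(unix_file)  # one record of length 3

print([len(r) for r in dos_records], xxd_view(dos_records).hex("-"))
print([len(r) for r in unix_records], xxd_view(unix_records).hex("-"))
```

Both lines end in 61-0a-62, which is why xxd alone cannot tell the two files apart.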
Now we use vim to open the file. The "dos" file is read as a
multi-line file, with "0a"s used as separators. Good. The "unix"
file is read as a one-liner. That's also fine with vim, which
notices the "0a"s and considers them line separators. Also good.
In both cases vim detected the fileformat as "unix" (since there
were no 0d0a sequences). If we were to save either of the files
using the VMS vim, it would become an RMS file with Stream_LF
record format (and the same carriage-return-carriage-control
record attributes).
Paradoxically, there is a problem when sourcing the multi-record,
originally-dos-fileformat files -- while sourcing the
single-record, originally-unix-fileformat file succeeds. Perhaps
this is a vim bug.
When sourcing the file, vim apparently does not recognise the
record limits and simply concatenates the records without putting
the 0a bytes in between. Thus, the whole text becomes a mess (the
lines are not separated at all).
By contrast, the originally-unix-fileformat file works
because after vim reads it as a one-liner, it detects the "0a"s,
separates the lines, and sources it correctly.
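The suspected difference between editing and sourcing can be sketched in Python as follows. This models the hypothesis stated above, not vim's actual code: the editing path is assumed to see a 0a between records, while the sourcing path is assumed to concatenate records with nothing in between. The function names and the two-line script are hypothetical.

```python
def read_for_editing(records: list) -> list:
    """Assumed editing path: a 0a is supplied between records, then split."""
    return b"\n".join(records).split(b"\n")


def read_for_sourcing(records: list) -> list:
    """Assumed faulty sourcing path: records are joined with no separator."""
    return b"".join(records).split(b"\n")


# The same two-line script after ASCII FTP, by original fileformat.
dos_origin = [b"let a = 1", b"let b = 2"]   # two records
unix_origin = [b"let a = 1\nlet b = 2"]     # one record with an embedded 0a

print(read_for_editing(dos_origin))    # two lines, fine
print(read_for_sourcing(dos_origin))   # one run-together line
print(read_for_sourcing(unix_origin))  # two lines, fine
```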
Conclusion
The above would suggest it is best to use fileformat=unix when
FTP copying. However, there is a serious problem with this: if
the file is longer than 32767 bytes, the file is truncated
because at most 32767 bytes can be put into a single RMS-file record.
(And we are creating a single-record file, as described above.)
What's worse, no warning/error is issued when this happens
during the ASCII-FTP copy.
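Since the transfer itself gives no warning, a check can be run on the Windows side beforehand. The Python sketch below assumes, as described above, that records are cut only at 0d0a sequences and that a record holds at most 32767 bytes. The function name oversized_records is made up for this note.

```python
MAX_RECORD_BYTES = 32767  # record limit discussed above


def oversized_records(data: bytes) -> list:
    """Return the lengths of would-be records that exceed the limit."""
    return [len(r) for r in data.split(b"\r\n") if len(r) > MAX_RECORD_BYTES]


unix_file = b"x\n" * 20000   # 40000 bytes, no 0d0a: one huge record
dos_file = b"x\r\n" * 20000  # same text, many one-byte records

print(oversized_records(unix_file))  # [40000]
print(oversized_records(dos_file))   # []
```

A non-empty result means the file would be silently truncated if copied as is.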
Thus, it is best and safest to use the dos fileformat, copy the file
via ASCII FTP, and then change the file's record format to Stream_LF
using convert/fdl.
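The first step, making sure the file has the dos fileformat before the transfer, could be scripted on the Windows side roughly as below. This is a minimal Python sketch with a hypothetical helper name. It only normalises line endings. The convert/fdl step is carried out on VMS afterwards and is not shown.

```python
import re


def to_dos(data: bytes) -> bytes:
    """Turn lone 0a line endings into 0d0a, leaving existing 0d0a untouched."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


print(to_dos(b"a\nb").hex("-"))  # 61-0d-0a-62
```

The conversion is idempotent, so running it on a file that is already in dos format changes nothing.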
State ⇒ Unconfirmed
New Attachment: A.ZIP
Queue ⇒ Vim on VMS
Summary ⇒ error :source-ing files with "variable length" record format
Type ⇒ Bug
Priority ⇒ 1. Low
vim with no problems, but :source-ing them fails since there are no
newlines. Presumably, "variable length" is one of those RMS formats
where newlines are not stored in the file. Instead, RMS knows the
lengths of the individual "rows"/records.
OpenVMS uses a record-based file system, which stores text files as one
record per line. In most file formats, no line terminators are actually
stored, but the Record Management Services facility can transparently
add a terminator to each line when it is retrieved by an application.
-- http://en.wikipedia.org/wiki/Crlf
It would seem that when a file is edited, the newlines are added, but
when it is sourced, they are not.
Please consider the file in the attachment (a.vim). It can be edited (vim
a.vim), but sourcing it (:so a.vim) fails.