[#25] error :source-ing files with "variable length" record format
Summary error :source-ing files with "variable length" record format
Queue Vim on VMS
Type Bug
State Accepted
Priority 1. Low
Owners
Requester sferencik (at) alpha (dot) polarhome (dot) com
Created 12/28/2010 (5287 days ago)
Due
Updated 03/29/2011 (5196 days ago)
Assigned
Resolved
Attachments

History
03/29/2011 Zoltan Arpadffy State ⇒ Accepted
 
12/28/2010 sferencik@alpha.polarhome.com Comment #3
                      ====================================
                      = ONCE MORE (more readable indent) =
                      = ignore Comment #2                =
                      ====================================

Motivation for this ticket.

The problem with sourcing appears whenever a vim plugin is copied to
VMS via FTP. If it is dropped into [plugin] or [ftplugin] or
[autoload] it then crashes when it is sourced.

--------------------
Details

Here are some notes I have compiled on copying files from Windows to
VMS via FTP.

Binary FTP
      First of all, binary FTP copying is out of the question. The file created
      on VMS is padded to the next higher multiple of 512 bytes, which is very
      ugly. This also shows when the file is opened in vim. (Other than that,
      the file thus copied is nicely readable on VMS by vim, and sourcing it
      works -- if its fileformat on Windows was "unix".) The rest of this
      therefore discusses ASCII FTP.

ASCII FTP
      A file on the local (Windows) machine has either the DOS or the UNIX
      fileformat. In the former case the lines are separated by 0d0a, in the
      latter case only by 0a. Obviously, the former is native and the latter is
      alien to Windows.

      When the file is ASCII-FTP copied to VMS, it is converted to an RMS file
      with variable-length record format and carriage-return-carriage-control
      record attributes. FTP knows it is copying a file from Windows to VMS,
      and thus presumes that the original (Windows) file uses 0d0a sequences as
      line separators. (This assumption is wrong if the fileformat is "unix".)

      Whatever the fileformat was on Windows, the resultant file has now been
      converted to an RMS file in which each line (record) has an attribute
      specifying its length. The record attributes are "Carriage return
      carriage control". This would suggest there are still some "carriage
      returns" between records (xxd reports line feeds, "0a"s), but this may
      just be a trick of OpenVMS:

          Wikipedia: "OpenVMS uses a record-based file system, which stores
          text files as one record per line. In most file formats, no line
          terminators are actually stored, but the Record Management Services
          facility can transparently add a terminator to each line when it is
          retrieved by an application."

      This is where the original fileformat makes a difference: a multi-line
      file with fileformat=dos has been converted to a multi-record RMS file,
      with records starting wherever 0d0a sequences were found. A multi-line
      file with fileformat=unix was also converted, but since there were no
      0d0a sequences, it only contains a single record on VMS. xxd sees them
      identically: for the multi-record "dos" file it displays "0a"s where
      records break; for the single-record "unix" file it displays "0a"s which
      have not been recognised as line terminators and are "still present" in
      the single record.

      Example: let's suppose a Windows file contains the text "a\nb". The "dos"
      representation would be 61-0d-0a-62, and this would be converted to a
      2-record file: the first record would have length 1 (the byte 61), the
      second record would also have length 1 (the byte 62), and xxd would show
      61-0a-62. The "unix" representation would be 61-0a-62, and this would be
      converted to a single-record file: the record would have length 3
      (61-0a-62), and xxd would also show 61-0a-62.
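
      The example above can be replayed with a toy model. This is only a
      sketch of the behaviour described in these notes, not real RMS or FTP
      code; the function names to_records and as_seen_by_xxd are made up for
      illustration, and the model assumes that a 0a is supplied between
      records when the file is read back.

```python
# Toy model of the ASCII-FTP conversion described above (not real RMS code).

def to_records(data: bytes) -> list:
    """Split the incoming bytes on 0d0a only, as the transfer is described as doing."""
    return data.split(b"\r\n")

def as_seen_by_xxd(records: list) -> bytes:
    """Assumption: a 0a terminator is supplied between records on read."""
    return b"\n".join(records)

dos_records = to_records(b"a\r\nb")    # "dos" representation: 61-0d-0a-62
unix_records = to_records(b"a\nb")     # "unix" representation: 61-0a-62

print(dos_records)                            # [b'a', b'b']
print(unix_records)                           # [b'a\nb']
print(as_seen_by_xxd(dos_records).hex("-"))   # 61-0a-62
print(as_seen_by_xxd(unix_records).hex("-"))  # 61-0a-62
```

      The record lists differ (two records of length 1 versus one record of
      length 3), yet the byte dump is the same in both cases.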

      Now we use vim to open the file. The "dos" file is read as a multi-line
      file, with "0a"s used as separators. Good. The "unix" file is read as a
      one-liner. That's also fine with vim, which notices the "0a"s and
      considers them line separators. Also good.

      In both cases vim detected the fileformat as "unix" (since there were no
      0d0a sequences). If we were to save either of the files using the VMS
      vim, it would become an RMS file with Stream_LF record format (and the
      same carriage-return-carriage-control record attributes).

      Paradoxically, there is a problem when sourcing the multi-record,
      originally-dos-fileformat files -- while sourcing the single-record,
      originally-unix-fileformat file succeeds. Perhaps this is a vim bug.

      When sourcing the file, vim apparently does not recognise the record
      boundaries and simply concatenates the records without putting the 0a
      bytes in between. Thus, the whole text becomes a mess (the lines are not
      separated at all).

      By contrast, the originally-unix-fileformat file works because after vim
      reads it as a one-liner, it detects the "0a"s, separates the lines, and
      sources it correctly.
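
      The difference between the two read paths can be sketched as follows.
      This models only the behaviour observed above; it is not vim's actual
      code, and the function names and the two-line script are hypothetical.

```python
# Toy model of the two read paths described in this ticket (not Vim's code).

def read_for_edit(records):
    """Editing: a 0a is supplied between records, then the text is split into lines."""
    return b"\n".join(records).split(b"\n")

def read_for_source(records):
    """Sourcing, as observed: records are concatenated with nothing in between."""
    return b"".join(records).split(b"\n")

# Hypothetical two-line script, stored in the two ways described above.
dos_origin = [b"let g:a = 1", b"let g:b = 2"]   # two records, no 0a stored
unix_origin = [b"let g:a = 1\nlet g:b = 2"]     # one record with an embedded 0a

print(read_for_edit(dos_origin))     # [b'let g:a = 1', b'let g:b = 2']
print(read_for_source(dos_origin))   # [b'let g:a = 1let g:b = 2']
print(read_for_source(unix_origin))  # [b'let g:a = 1', b'let g:b = 2']
```

      Under this model the multi-record file collapses into one unusable line
      when sourced, while the single-record file survives because its "0a"s
      are part of the record data.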

Conclusion
      The above would suggest it is best to use fileformat=unix when FTP
      copying. However, there is a serious problem with this: if the file is
      longer than 32767 bytes, it is truncated, because at most 32767 bytes can
      be put into a single RMS-file record. (And we are creating a
      single-record file, as described above.) What's worse, no warning or
      error is issued when this happens during the ASCII-FTP copy.
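
      The truncation can be illustrated with the same kind of toy model. It
      assumes the 32767-byte per-record limit mentioned above and a silent
      cut-off; ascii_ftp_store and the sample data are hypothetical.

```python
# Toy model of the silent truncation (not real FTP or RMS code).

MAX_RECORD = 32767  # assumed per-record limit in bytes, taken from the notes above

def ascii_ftp_store(data: bytes) -> list:
    """Split on 0d0a and silently cut every record at MAX_RECORD bytes."""
    return [rec[:MAX_RECORD] for rec in data.split(b"\r\n")]

line = b'" comment\n'                          # one 10-byte line
unix_file = line * 4000                        # 40000 bytes, no 0d0a anywhere
dos_file = unix_file.replace(b"\n", b"\r\n")   # same text with 0d0a separators

stored_unix = ascii_ftp_store(unix_file)
stored_dos = ascii_ftp_store(dos_file)

print(len(stored_unix), len(stored_unix[0]))   # 1 32767
print(len(unix_file) - len(stored_unix[0]))    # 7233 bytes lost
print(max(len(rec) for rec in stored_dos))     # 9
```

      The "unix" file ends up as one record and loses its tail, while the
      "dos" file is split into many short records that stay far below the
      limit.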

      Thus, it is best and safest to use the dos fileformat, copy via ASCII
      FTP, and then change the file's record format to Stream_LF using
      convert/fdl.
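
      The first step of this recipe (making sure the file really has the dos
      fileformat before it is uploaded) could be scripted on the Windows side
      roughly as below. This is a sketch only: to_dos and longest_record are
      made-up helper names, the 32767-byte limit is the assumption used
      above, and the convert/fdl step on VMS is not covered.

```python
import re

MAX_RECORD = 32767  # assumed per-record limit in bytes

def to_dos(data: bytes) -> bytes:
    """Turn every bare 0a into 0d0a; existing 0d0a pairs are left alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)

def longest_record(data: bytes) -> int:
    """Length of the longest record an 0d0a-splitting transfer would create."""
    return max(len(rec) for rec in data.split(b"\r\n"))

prepared = to_dos(b"a\nb\r\nc\n")               # mixed line endings on input
print(prepared)                                 # b'a\r\nb\r\nc\r\n'
print(longest_record(prepared) <= MAX_RECORD)   # True
```

      Running to_dos on a file that is already in dos format leaves it
      unchanged, so the check is safe to apply unconditionally.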


12/28/2010 sferencik@alpha.polarhome.com Comment #2

12/28/2010 sferencik@alpha.polarhome.com Comment #1
State ⇒ Unconfirmed
New Attachment: A.ZIP
Queue ⇒ Vim on VMS
Summary ⇒ error :source-ing files with "variable length" record format
Type ⇒ Bug
Priority ⇒ 1. Low
Files with record format "variable length" can be opened (edited) by vim with
no problems, but :source-ing them fails since there are no newlines.

Presumably, "variable length" is one of those RMS formats where newlines are
not stored in the file. Instead, RMS knows the lengths of the individual
"rows"/records.

     OpenVMS uses a record-based file system, which stores text files as one
     record per line. In most file formats, no line terminators are actually
     stored, but the Record Management Services facility can transparently add
     a terminator to each line when it is retrieved by an application.

     -- http://en.wikipedia.org/wiki/Crlf

It would seem that when a file is edited, the newlines are added, but when it
is sourced, they are not.

Please consider the file in the attachment (a.vim). It can be edited (vim
a.vim), but sourcing it (:so a.vim) fails.