Object
RangesIO is a basic class for wrapping another IO object, allowing you to arbitrarily reorder slices of the input file by providing a list of ranges. It is intended as an initial measure to curb the inefficiency of the Dirent#data method, which reads all of a file's data in one hit with no way to stream it.
This class will encapsulate the ranges (corresponding to big or small blocks) of any ole file and thus allow reading/writing directly to the source bytes, in a streamed fashion (so just getting 16 bytes doesn't read the whole thing).
In the simplest case it can be used with a single range to provide a limited io to a section of a file.
No buffering, by design at the moment. Intended for large reads.
On further reflection, this class is something of a joining/optimization of two separate IO classes: a SubfileIO, for providing access to a range within a File as a separate IO object, and a ConcatIO, allowing the presentation of a bunch of io objects as a single unified whole.
I will need such a ConcatIO if I'm to provide Mime#to_io, a method that will convert a whole mime message into an IO stream that can be read from. It will just be the concatenation of a series of IO objects: StringIOs corresponding to headers and boundaries, and SubfileIO objects coming from the original message proper, or RangesIO objects as provided by Attachment#data, which will then get wrapped by Mime in a Base64IO or similar to get encoded on-the-fly. Thus the attachment, in its plain or encoded form, and the message as a whole never exist as a single string in memory, as they do now. This is a fair bit of work to achieve, but generally useful I believe.
This class isn't ole specific; maybe move it to my general ruby stream project.
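To make the ConcatIO idea concrete, here is a minimal sketch. The class name ConcatIOSketch and its read method are hypothetical and are not part of this library; the sketch only assumes that each wrapped object responds to read the way StringIO does.

```ruby
require 'stringio'

# Hypothetical illustration only: presents several IO objects as one
# read-only stream by draining them in order.
class ConcatIOSketch
  def initialize(*ios)
    @ios = ios
  end

  # Read up to `limit` bytes, or everything remaining if limit is nil.
  def read(limit = nil)
    data = +''
    @ios.each do |io|
      break if limit && data.length >= limit
      chunk = limit ? io.read(limit - data.length) : io.read
      data << chunk.to_s # read returns nil at EOF when given a length
    end
    data
  end
end

headers = StringIO.new("Subject: hi\n\n")
body    = StringIO.new("body")
message = ConcatIOSketch.new(headers, body)
first   = message.read(9) # stops inside the headers
rest    = message.read    # continues across the boundary into the body
```

Because each part keeps its own position, a later read simply picks up where the previous one stopped, and the whole message is never assembled in memory unless the caller asks for it.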
io | the parent io object that we are wrapping.
mode | the mode to use.
params | hash of params:
  :ranges - byte offsets, either:
    an array of ranges, [1..2, 4..5, 6..8], or
    an array of [offset, length] arrays, [[1, 1], [4, 1], [6, 2]] for the above (think of the way String indexing works)
  :close_parent - boolean; close the parent when this object is closed
NOTE: the ranges can overlap.
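The two accepted forms of :ranges can be compared with a small sketch. The helper normalize_ranges below is hypothetical (it is not a library method); it mirrors the conversion done in ranges=, where a Range a..b becomes the pair [a, b - a], so the end offset is effectively exclusive.

```ruby
# Hypothetical helper, not part of the library: normalize either accepted
# form of :ranges into [offset, length] pairs.
def normalize_ranges(ranges)
  ranges.map do |r|
    # A Range a..b is read as "start at a, length b - a".
    r.is_a?(Range) ? [r.begin, r.end - r.begin] : r
  end
end

from_ranges = normalize_ranges([1..2, 4..5, 6..8])
from_pairs  = normalize_ranges([[1, 1], [4, 1], [6, 2]])
same = (from_ranges == from_pairs) # both forms describe the same bytes
```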
# File lib/ole/ranges_io.rb, line 55
def initialize io, mode='r', params={}
  mode, params = 'r', mode if Hash === mode
  ranges = params[:ranges]
  @params = {:close_parent => false}.merge params
  @mode = IO::Mode.new mode
  @io = io
  # initial position in the file
  @pos = 0
  self.ranges = ranges || [[0, io.size]]
  # handle some mode flags
  truncate 0 if @mode.truncate?
  seek size if @mode.append?
end

# File lib/ole/ranges_io.rb, line 148
def close
  @io.close if @params[:close_parent]
end
I could wrap this in a buffered IO stream that provides gets and handles pos and truncate appropriately. This method was mostly added just to pass the tests. FIXME
# File lib/ole/ranges_io.rb, line 236
def gets
  s = read 1024
  i = s.index "\n"
  self.pos -= s.length - (i+1)
  s[0..i]
end
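The gets shown above appears to assume that a newline occurs within the first 1024 bytes read. As a sketch of what a more defensive version might look like, the hypothetical helper gets_line below (not the library's implementation) keeps reading chunks until it finds a newline or reaches the end. It assumes only that the io responds to eof?, read and seek, and it is demonstrated on a StringIO.

```ruby
require 'stringio'

# Hypothetical variant of gets, not the library's implementation.
# Returns the next line including its newline, the final unterminated
# line, or nil at end of stream.
def gets_line(io, chunk_size = 1024)
  line = +''
  until io.eof?
    chunk = io.read(chunk_size)
    i = chunk.index("\n")
    if i
      line << chunk[0..i]
      # Step back over the bytes that were read past the newline.
      io.seek(-(chunk.length - (i + 1)), IO::SEEK_CUR)
      return line
    end
    line << chunk
  end
  line.empty? ? nil : line
end

io = StringIO.new("ab\ncd")
first  = gets_line(io, 2) # line spans two chunks
second = gets_line(io, 2) # last line has no trailing newline
third  = gets_line(io, 2) # nothing left
```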
# File lib/ole/ranges_io.rb, line 244
def inspect
  "#<#{self.class} io=#{io.inspect}, size=#{@size}, pos=#{@pos}>"
end
# File lib/ole/ranges_io.rb, line 111
def pos= pos, whence=IO::SEEK_SET
  case whence
  when IO::SEEK_SET
  when IO::SEEK_CUR
    pos += @pos
  when IO::SEEK_END
    pos = @size + pos
  else raise Errno::EINVAL
  end
  raise Errno::EINVAL unless (0..@size) === pos
  @pos = pos

  # do a binary search through @offsets to find the active range.
  a, c, b = 0, 0, @offsets.length
  while a < b
    c = (a + b).div(2)
    pivot = @offsets[c]
    if pos == pivot
      @active = c
      return
    elsif pos < pivot
      b = c
    else
      a = c + 1
    end
  end

  @active = a - 1
end
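The binary search at the end of pos= can be tried in isolation. The helper active_index below is a hypothetical standalone copy of that loop, and the offsets array is an assumed example: the cumulative start positions of three ranges with lengths 8, 2 and 3.

```ruby
# Hypothetical standalone version of the search in pos=: given the
# cumulative start offsets of each range, return the index of the range
# that contains the logical position pos.
def active_index(offsets, pos)
  a, b = 0, offsets.length
  while a < b
    c = (a + b) / 2
    pivot = offsets[c]
    return c if pos == pivot # pos is exactly at the start of range c
    if pos < pivot
      b = c
    else
      a = c + 1
    end
  end
  a - 1 # last range whose start offset is below pos
end

offsets = [0, 8, 10] # ranges of length 8, 2 and 3
at_start  = active_index(offsets, 0)
in_second = active_index(offsets, 9)
in_third  = active_index(offsets, 12)
```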
# File lib/ole/ranges_io.rb, line 81
def ranges= ranges
  # convert ranges to arrays. check for negative ranges?
  ranges = ranges.map { |r| Range === r ? [r.begin, r.end - r.begin] : r }
  # combine ranges
  if @params[:combine] == false
    # might be useful for debugging...
    @ranges = ranges
  else
    @ranges = []
    next_pos = nil
    ranges.each do |pos, len|
      if next_pos == pos
        @ranges.last[1] += len
        next_pos += len
      else
        @ranges << [pos, len]
        next_pos = pos + len
      end
    end
  end
  # calculate cumulative offsets from range sizes
  @size = 0
  @offsets = []
  @ranges.each do |pos, len|
    @offsets << @size
    @size += len
  end
  self.pos = @pos
end
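The two later steps of ranges=, combining adjacent ranges and computing cumulative offsets, can be sketched as standalone functions. The names combine_ranges and cumulative_offsets are hypothetical and the input pairs are an assumed example.

```ruby
# Hypothetical helper mirroring the combine step: merge each
# [offset, length] pair into the previous one when it starts exactly
# where the previous one ended.
def combine_ranges(pairs)
  combined = []
  next_pos = nil
  pairs.each do |pos, len|
    if next_pos == pos
      combined.last[1] += len
      next_pos += len
    else
      combined << [pos, len]
      next_pos = pos + len
    end
  end
  combined
end

# Hypothetical helper mirroring the offsets step: return the logical
# start offset of each range together with the total size.
def cumulative_offsets(pairs)
  size = 0
  offsets = pairs.map do |_pos, len|
    start = size
    size += len
    start
  end
  [offsets, size]
end

merged = combine_ranges([[0, 4], [4, 4], [10, 2]]) # first two are adjacent
offsets, size = cumulative_offsets(merged)
```

Combining adjacent ranges means a contiguous run of blocks can be served with one seek and one read on the parent io instead of one per block.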
Read bytes from file, to a maximum of limit, or all available if unspecified.
# File lib/ole/ranges_io.rb, line 157
def read limit=nil
  data = ''
  return data if eof?
  limit ||= size
  pos, len = @ranges[@active]
  diff = @pos - @offsets[@active]
  pos += diff
  len -= diff
  loop do
    @io.seek pos
    if limit < len
      s = @io.read(limit).to_s
      @pos += s.length
      data << s
      break
    end
    s = @io.read(len).to_s
    @pos += s.length
    data << s
    break if s.length != len
    limit -= len
    break if @active == @ranges.length - 1
    @active += 1
    pos, len = @ranges[@active]
  end
  data
end
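The core idea of read, visiting each range in turn with a seek followed by a bounded read, can be shown with a simplified sketch on a StringIO. The helper read_ranges is hypothetical; unlike the method above it always starts at the first range and keeps no position between calls.

```ruby
require 'stringio'

# Hypothetical, simplified illustration of reading across ranges: seek to
# each [offset, length] pair in order and read at most `limit` bytes in
# total. Always starts from the first range.
def read_ranges(io, ranges, limit = nil)
  data = +''
  ranges.each do |pos, len|
    len = [len, limit - data.length].min if limit
    break if len <= 0
    io.seek pos
    data << io.read(len).to_s # to_s guards against nil at EOF
  end
  data
end

io = StringIO.new('abcdefghij')
reordered = read_ranges(io, [[6, 2], [0, 3]])    # slices in a new order
limited   = read_ranges(io, [[6, 2], [0, 3]], 3) # stops part-way through
overlap   = read_ranges(io, [[0, 3], [1, 3]])    # ranges may overlap
```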
Forwards explicitly to truncate rather than being an alias, so that overriding truncate in a subclass also changes size=. Override truncate, not this method.
# File lib/ole/ranges_io.rb, line 192
def size= size
  truncate size
end
You may override this call to update @ranges and @size, if applicable.
# File lib/ole/ranges_io.rb, line 186
def truncate size
  raise NotImplementedError, 'truncate not supported'
end
# File lib/ole/ranges_io.rb, line 196
def write data
  return 0 if data.empty?
  data_pos = 0
  # if we don't have room, we can use the truncate hook to make more space.
  if data.length > @size - @pos
    begin
      truncate @pos + data.length
    rescue NotImplementedError
      raise IOError, "unable to grow #{inspect} to write #{data.length} bytes"
    end
  end
  pos, len = @ranges[@active]
  diff = @pos - @offsets[@active]
  pos += diff
  len -= diff
  loop do
    @io.seek pos
    if data_pos + len > data.length
      chunk = data[data_pos..-1]
      @io.write chunk
      @pos += chunk.length
      data_pos = data.length
      break
    end
    @io.write data[data_pos, len]
    @pos += len
    data_pos += len
    break if @active == @ranges.length - 1
    @active += 1
    pos, len = @ranges[@active]
  end
  data_pos
end