public class WriterImpl extends Object implements WriterInternal, MemoryManager.Callback
This class is unsynchronized, like most Stream objects, so the creation of an OrcFile writer and all access to a single instance must happen on a single thread.
There are no known cases today where these happen across different threads.
Caveat: the MemoryManager is created when the WriterOptions are created, so that step has to be confined to a single thread as well.
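One way to honor the single-thread requirement is to route creation and every later call through a dedicated thread. The sketch below uses only the Java standard library and is not taken from ORC: `UnsafeSink` is a hypothetical stand-in for an unsynchronized writer, and `writeAll` is an illustrative helper name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SingleThreadWriterSketch {

    /** Hypothetical stand-in for an unsynchronized writer; not an ORC class. */
    static final class UnsafeSink {
        private final Thread owner = Thread.currentThread();
        private final List<String> rows = new ArrayList<>();

        void add(String row) {
            // Fail fast if any thread other than the creating one calls in.
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("accessed from a second thread");
            }
            rows.add(row);
        }

        int size() {
            return rows.size();
        }
    }

    /** Creates the sink and performs every call on one dedicated thread. */
    static int writeAll(List<String> input) {
        ExecutorService writerThread = Executors.newSingleThreadExecutor();
        try {
            // Creation also runs on the dedicated thread, mirroring the caveat
            // that setup must be confined to a single thread as well.
            Callable<UnsafeSink> create = UnsafeSink::new;
            UnsafeSink sink = writerThread.submit(create).get();
            for (String row : input) {
                Runnable addRow = () -> sink.add(row);
                writerThread.submit(addRow).get();
            }
            Callable<Integer> count = sink::size;
            return writerThread.submit(count).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("writer task failed", e);
        } finally {
            writerThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAll(List.of("a", "b", "c")));
    }
}
```

With a real writer, the same pattern would apply to building the options, constructing the writer, adding batches, and closing it, all submitted to the same executor.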
Constructor and Description |
---|
WriterImpl(FileSystem fs, Path path, OrcFile.WriterOptions opts) |
Modifier and Type | Method | Description |
---|---|---|
void | addRowBatch(VectorizedRowBatch batch) | Add a row batch to the ORC file. |
void | addUserMetadata(String name, ByteBuffer value) | Add arbitrary meta-data to the ORC file. |
void | appendStripe(byte[] stripe, int offset, int length, StripeInformation stripeInfo, OrcProto.StripeStatistics stripeStatistics) | Fast stripe append to ORC file. |
void | appendStripe(byte[] stripe, int offset, int length, StripeInformation stripeInfo, StripeStatistics[] stripeStatistics) | Fast stripe append to ORC file. |
void | appendUserMetadata(List<OrcProto.UserMetadataItem> userMetadata) | Update the current user metadata with a list of new values. |
boolean | checkMemory(double newScale) | The scale factor for the stripe size has changed and thus the writer should adjust its desired size appropriately. |
void | close() | Flush all of the buffers and close the file. |
static CompressionCodec | createCodec(CompressionKind kind) | |
long | estimateMemory() | Estimate the memory currently used by the writer to buffer the stripe. |
CompressionCodec | getCompressionCodec() | |
static int | getEstimatedBufferSize(long stripeSize, int numColumns, int bs) | |
long | getNumberOfRows() | Row count gets updated when flushing the stripes. |
long | getRawDataSize() | Raw data size will be computed when writing the file footer. |
TypeDescription | getSchema() | Get the schema for this writer. |
ColumnStatistics[] | getStatistics() | Get the statistics about the columns in the file. |
List<StripeInformation> | getStripes() | Get the stripe information about the file. |
void | increaseCompressionSize(int newSize) | Increase the buffer size for this writer. |
long | writeIntermediateFooter() | Write an intermediate footer on the file such that if the file is truncated to the returned offset, it would be a valid ORC file. |
public WriterImpl(FileSystem fs, Path path, OrcFile.WriterOptions opts) throws IOException
Throws:
IOException
public static int getEstimatedBufferSize(long stripeSize, int numColumns, int bs)
public void increaseCompressionSize(int newSize)
Description copied from interface: WriterInternal
Increase the buffer size for this writer.
Specified by:
increaseCompressionSize in interface WriterInternal
Parameters:
newSize - the new buffer size.
public static CompressionCodec createCodec(CompressionKind kind)
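public boolean checkMemory(double newScale) throws IOException
Description copied from interface: MemoryManager.Callback
The scale factor for the stripe size has changed and thus the writer should adjust its desired size appropriately.
Specified by:
checkMemory in interface MemoryManager.Callback
Parameters:
newScale - the current scale factor for memory allocations
Throws:
IOException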
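The callback described above can be pictured with a small standalone sketch. Everything in it is an assumption made for illustration: `ScaleCallback` and `BufferingWriter` are hypothetical types, the desired size is taken to be the configured stripe size multiplied by the scale factor, and the boolean result is taken to mean "the writer was over its limit and flushed". None of this is the ORC implementation.

```java
public class MemoryScaleSketch {

    /** Hypothetical callback shaped like checkMemory(double); not the ORC interface. */
    interface ScaleCallback {
        boolean checkMemory(double newScale);
    }

    /** Hypothetical writer that buffers bytes toward a scaled stripe size. */
    static final class BufferingWriter implements ScaleCallback {
        private final long configuredStripeSize;
        private long buffered;
        private int flushes;

        BufferingWriter(long configuredStripeSize) {
            this.configuredStripeSize = configuredStripeSize;
        }

        void buffer(long bytes) {
            buffered += bytes;
        }

        @Override
        public boolean checkMemory(double newScale) {
            // Assumption: the desired size is the configured size times the scale.
            long desired = Math.round(configuredStripeSize * newScale);
            if (buffered > desired) {
                flush();
                return true; // over the scaled limit, so the buffer was flushed
            }
            return false;
        }

        private void flush() {
            buffered = 0;
            flushes++;
        }

        long buffered() {
            return buffered;
        }

        int flushes() {
            return flushes;
        }
    }

    public static void main(String[] args) {
        BufferingWriter writer = new BufferingWriter(1000);
        writer.buffer(600);
        System.out.println(writer.checkMemory(1.0)); // 600 is under the limit of 1000
        System.out.println(writer.checkMemory(0.5)); // 600 is over the limit of 500
    }
}
```

The point of the sketch is the direction of control: the memory manager decides the scale, and each registered writer reacts by comparing what it has buffered against its own scaled target.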
public TypeDescription getSchema()
Description copied from interface: Writer
Get the schema for this writer.
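public void addUserMetadata(String name, ByteBuffer value)
Description copied from interface: Writer
Add arbitrary meta-data to the ORC file.
Specified by:
addUserMetadata in interface Writer
Parameters:
name - a key to label the data with.
value - the contents of the metadata.
public void addRowBatch(VectorizedRowBatch batch) throws IOException
Description copied from interface: Writer
Add a row batch to the ORC file.
Specified by:
addRowBatch in interface Writer
Parameters:
batch - the rows to add
Throws:
IOException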
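public void close() throws IOException
Description copied from interface: Writer
Flush all of the buffers and close the file.
Specified by:
close in interface Closeable
close in interface AutoCloseable
close in interface Writer
Throws:
IOException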
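public long getRawDataSize()
Raw data size will be computed when writing the file footer.
Specified by:
getRawDataSize in interface Writer
public long getNumberOfRows()
Row count gets updated when flushing the stripes.
Specified by:
getNumberOfRows in interface Writer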
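public long writeIntermediateFooter() throws IOException
Description copied from interface: Writer
Write an intermediate footer on the file such that if the file is truncated to the returned offset, it would be a valid ORC file.
Specified by:
writeIntermediateFooter in interface Writer
Throws:
IOException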
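The recovery step implied by the description, cutting a file back to a previously recorded offset, can be sketched with the standard library alone. This example works on a local temporary file via `java.nio.file.Path` (not the Hadoop `Path` used by the constructor), and the value `64` is a placeholder for an offset that a writer would have returned earlier. The names `truncateTo` and `demo` are illustrative, and truncation on a distributed file system may need a different mechanism.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TruncateToOffsetSketch {

    /** Cuts the file back to validLength bytes and returns the resulting size. */
    static long truncateTo(Path file, long validLength) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            channel.truncate(validLength);
            return channel.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes 100 bytes to a temporary file, then truncates to a recorded offset. */
    static long demo() {
        try {
            Path file = Files.createTempFile("truncate-sketch", ".bin");
            try {
                Files.write(file, new byte[100]);
                // Placeholder for an offset recorded from an earlier footer write.
                long recordedOffset = 64L;
                return truncateTo(file, recordedOffset);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Bytes written after the recorded offset are discarded by the truncation, which is why the offset has to be captured at the moment the intermediate footer is written.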
public void appendStripe(byte[] stripe, int offset, int length, StripeInformation stripeInfo, OrcProto.StripeStatistics stripeStatistics) throws IOException
Description copied from interface: Writer
Fast stripe append to ORC file. Use Writer.appendStripe(byte[], int, int, StripeInformation, StripeStatistics[]) for files with encryption.
Specified by:
appendStripe in interface Writer
Parameters:
stripe - stripe as byte array
offset - offset within byte array
length - length of stripe within byte array
stripeInfo - stripe information
stripeStatistics - unencrypted stripe statistics
Throws:
IOException
public void appendStripe(byte[] stripe, int offset, int length, StripeInformation stripeInfo, StripeStatistics[] stripeStatistics) throws IOException
Description copied from interface: Writer
Fast stripe append to ORC file. Use Writer.addUserMetadata(String, ByteBuffer) to append any user metadata.
Specified by:
appendStripe in interface Writer
Parameters:
stripe - stripe as byte array
offset - offset within byte array
length - length of stripe within byte array
stripeInfo - stripe information
stripeStatistics - stripe statistics with the last one being for the unencrypted data and the others being for each encryption variant.
Throws:
IOException
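Both appendStripe overloads identify the stripe as a region of a byte array through the `stripe`, `offset`, and `length` parameters. The sketch below shows, under the usual Java convention, how such a triple selects the bytes in the range from `offset` up to but excluding `offset + length`. The helper name `slice` is hypothetical, and whether the ORC implementation validates or copies the region this way is not stated in this page.

```java
import java.util.Arrays;
import java.util.Objects;

public class StripeSliceSketch {

    /** Returns a copy of buffer[offset .. offset + length) after checking the range. */
    static byte[] slice(byte[] buffer, int offset, int length) {
        // Throws IndexOutOfBoundsException when the region does not fit the array.
        Objects.checkFromIndexSize(offset, length, buffer.length);
        return Arrays.copyOfRange(buffer, offset, offset + length);
    }

    public static void main(String[] args) {
        byte[] buffer = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(slice(buffer, 1, 3)));
    }
}
```

Passing an offset and length instead of a trimmed array lets a caller reuse one large read buffer across several stripes without copying each one first.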
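public void appendUserMetadata(List<OrcProto.UserMetadataItem> userMetadata)
Description copied from interface: Writer
Update the current user metadata with a list of new values.
Specified by:
appendUserMetadata in interface Writer
Parameters:
userMetadata - user metadata
public ColumnStatistics[] getStatistics()
Description copied from interface: Writer
Get the statistics about the columns in the file.
Specified by:
getStatistics in interface Writer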
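public List<StripeInformation> getStripes() throws IOException
Description copied from interface: Writer
Get the stripe information about the file.
Specified by:
getStripes in interface Writer
Throws:
IOException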
public CompressionCodec getCompressionCodec()
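public long estimateMemory()
Description copied from interface: Writer
Estimate the memory currently used by the writer to buffer the stripe.
Specified by:
estimateMemory in interface Writer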
Copyright © 2013–2023 The Apache Software Foundation. All rights reserved.