nectec.semantic.web.knowledge.application.framework.common
Class UnicodeInputStream

java.lang.Object
  extended by java.io.InputStream
      extended by nectec.semantic.web.knowledge.application.framework.common.UnicodeInputStream
All Implemented Interfaces:
Closeable

public class UnicodeInputStream
extends InputStream

Modified from Velocity 1.5 org.apache.velocity.io.UnicodeInputStream.

This is an input stream that is Unicode BOM aware. This allows you to e.g. read Windows Notepad Unicode files as Velocity templates. It allows you to check the actual encoding of a file by calling getEncodingFromStream() on the input stream reader.

This class is not thread safe! When more than one thread wants to use an instance of UnicodeInputStream, the caller must provide synchronization.

BOMs:
  00 00 FE FF = UTF-32, big-endian
  FF FE 00 00 = UTF-32, little-endian
  FE FF       = UTF-16, big-endian
  FF FE       = UTF-16, little-endian
  EF BB BF    = UTF-8

Win2k Notepad: Unicode format = UTF-16LE
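
The BOM list above can be turned into a small lookup. The following is a minimal sketch, not this class's actual implementation: BomSniffer and detect are hypothetical names introduced only for illustration. It shows one detail that is easy to get wrong: the UTF-32LE mark (FF FE 00 00) begins with the UTF-16LE mark (FF FE), so the longer signatures have to be tested first.

```java
/** Hypothetical helper (not part of UnicodeInputStream): maps leading bytes to an encoding name. */
final class BomSniffer {
    // Longest signatures first, because FF FE 00 00 (UTF-32LE) starts with FF FE (UTF-16LE).
    private static final String[] NAMES = {
        "UTF-32BE", "UTF-32LE", "UTF-8", "UTF-16BE", "UTF-16LE"
    };
    private static final int[][] MARKS = {
        {0x00, 0x00, 0xFE, 0xFF},
        {0xFF, 0xFE, 0x00, 0x00},
        {0xEF, 0xBB, 0xBF},
        {0xFE, 0xFF},
        {0xFF, 0xFE},
    };

    /** Returns the encoding implied by the BOM at the start of head, or null if none matches. */
    static String detect(byte[] head) {
        for (int i = 0; i < MARKS.length; i++) {
            if (startsWith(head, MARKS[i])) {
                return NAMES[i];
            }
        }
        return null;
    }

    private static boolean startsWith(byte[] head, int[] mark) {
        if (head.length < mark.length) {
            return false;
        }
        for (int j = 0; j < mark.length; j++) {
            // Mask to compare as unsigned, since Java bytes are signed.
            if ((head[j] & 0xFF) != mark[j]) {
                return false;
            }
        }
        return true;
    }
}
```

With this ordering, a buffer starting with FF FE 00 00 is reported as UTF-32LE, while FF FE followed by any other bytes is reported as UTF-16LE.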

See Also:
Unicode BOM FAQ, JDK Bug 4508058

Nested Class Summary
static class UnicodeInputStream.UnicodeBOM
          Helper class to bundle encoding and BOM marker.
 
Field Summary
static UnicodeInputStream.UnicodeBOM UTF16BE_BOM
          BOM Marker for UTF 16, big endian.
static UnicodeInputStream.UnicodeBOM UTF16LE_BOM
          BOM Marker for UTF 16, little endian.
static UnicodeInputStream.UnicodeBOM UTF32BE_BOM
          BOM Marker for UTF 32, big endian.
static UnicodeInputStream.UnicodeBOM UTF32LE_BOM
          BOM Marker for UTF 32, little endian.
static UnicodeInputStream.UnicodeBOM UTF8_BOM
          BOM Marker for UTF 8.
 
Constructor Summary
UnicodeInputStream(InputStream inputStream)
          Creates a new UnicodeInputStream object.
UnicodeInputStream(InputStream inputStream, boolean skipBOM)
          Creates a new UnicodeInputStream object.
 
Method Summary
 int available()
           
 void close()
           
 String getEncodingFromStream()
          Read encoding based on BOM.
 boolean isSkipBOM()
          Returns true if the input stream discards the BOM.
 void mark(int readlimit)
           
 boolean markSupported()
           
 int read()
           
 int read(byte[] b)
           
 int read(byte[] b, int off, int len)
           
 void reset()
           
 long skip(long n)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UTF8_BOM

public static final UnicodeInputStream.UnicodeBOM UTF8_BOM
BOM Marker for UTF 8. See http://www.unicode.org/unicode/faq/utf_bom.html


UTF16LE_BOM

public static final UnicodeInputStream.UnicodeBOM UTF16LE_BOM
BOM Marker for UTF 16, little endian. See http://www.unicode.org/unicode/faq/utf_bom.html


UTF16BE_BOM

public static final UnicodeInputStream.UnicodeBOM UTF16BE_BOM
BOM Marker for UTF 16, big endian. See http://www.unicode.org/unicode/faq/utf_bom.html


UTF32LE_BOM

public static final UnicodeInputStream.UnicodeBOM UTF32LE_BOM
BOM Marker for UTF 32, little endian. See http://www.unicode.org/unicode/faq/utf_bom.html TODO: Does Java actually support this?


UTF32BE_BOM

public static final UnicodeInputStream.UnicodeBOM UTF32BE_BOM
BOM Marker for UTF 32, big endian. See http://www.unicode.org/unicode/faq/utf_bom.html TODO: Does Java actually support this?

Constructor Detail

UnicodeInputStream

public UnicodeInputStream(InputStream inputStream)
                   throws IllegalStateException,
                          IOException
Creates a new UnicodeInputStream object. If the stream starts with a BOM, the BOM is used to determine the file encoding and is then skipped.

Parameters:
inputStream - The input stream to use for reading.
Throws:
IllegalStateException
IOException

UnicodeInputStream

public UnicodeInputStream(InputStream inputStream,
                          boolean skipBOM)
                   throws IllegalStateException,
                          IOException
Creates a new UnicodeInputStream object.

Parameters:
inputStream - The input stream to use for reading.
skipBOM - If this is set to true, a BOM read from the stream is discarded. This parameter should normally be true.
Throws:
IllegalStateException
IOException
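
To illustrate why skipBOM should normally be true, here is a small sketch that does not use this class at all. SkipBomDemo and decodeUtf8 are hypothetical names, and only the UTF-8 mark is handled. Java's UTF-8 decoder does not strip a leading BOM (the behaviour discussed in the JDK bug listed under See Also), so a BOM that is left in the data shows up as a U+FEFF character at the start of the decoded text.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demo (not part of UnicodeInputStream): effect of keeping or skipping a UTF-8 BOM. */
final class SkipBomDemo {
    /** Decodes raw as UTF-8, dropping a leading EF BB BF only when skipBOM is true. */
    static String decodeUtf8(byte[] raw, boolean skipBOM) {
        int offset = 0;
        boolean hasBom = raw.length >= 3
                && (raw[0] & 0xFF) == 0xEF
                && (raw[1] & 0xFF) == 0xBB
                && (raw[2] & 0xFF) == 0xBF;
        if (skipBOM && hasBom) {
            offset = 3; // start decoding after the three BOM bytes
        }
        return new String(raw, offset, raw.length - offset, StandardCharsets.UTF_8);
    }
}
```

For the bytes EF BB BF 68 69, decodeUtf8(raw, true) gives "hi", while decodeUtf8(raw, false) gives a three-character string whose first character is U+FEFF, which would otherwise leak into the parsed template or document.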
Method Detail

isSkipBOM

public boolean isSkipBOM()
Returns true if the input stream discards the BOM.

Returns:
True if the input stream discards the BOM.

getEncodingFromStream

public String getEncodingFromStream()
Read encoding based on BOM.

Returns:
The encoding based on the BOM.
Throws:
IllegalStateException - If a problem occurred while reading the BOM.

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Overrides:
close in class InputStream
Throws:
IOException
See Also:
InputStream.close()

available

public int available()
              throws IOException
Overrides:
available in class InputStream
Throws:
IOException
See Also:
InputStream.available()

mark

public void mark(int readlimit)
Overrides:
mark in class InputStream
See Also:
InputStream.mark(int)

markSupported

public boolean markSupported()
Overrides:
markSupported in class InputStream
See Also:
InputStream.markSupported()

read

public int read()
         throws IOException
Specified by:
read in class InputStream
Throws:
IOException
See Also:
InputStream.read()

read

public int read(byte[] b)
         throws IOException
Overrides:
read in class InputStream
Throws:
IOException
See Also:
InputStream.read(byte[])

read

public int read(byte[] b,
                int off,
                int len)
         throws IOException
Overrides:
read in class InputStream
Throws:
IOException
See Also:
InputStream.read(byte[], int, int)

reset

public void reset()
           throws IOException
Overrides:
reset in class InputStream
Throws:
IOException
See Also:
InputStream.reset()

skip

public long skip(long n)
          throws IOException
Overrides:
skip in class InputStream
Throws:
IOException
See Also:
InputStream.skip(long)