How to read a large text file line by line using Java?


Answers

Look at this blog:

The buffer size may be specified, or the default size may be used. The default is large enough for most purposes.

// Open the file
FileInputStream fstream = new FileInputStream("textfile.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));

String strLine;

//Read File Line By Line
while ((strLine = br.readLine()) != null)   {
  // Print the content on the console
  System.out.println (strLine);
}

//Close the input stream
br.close();
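
To illustrate the remark about buffer size, here is a minimal sketch that passes an explicit size to the BufferedReader constructor. The class name BufferedLineCounter, the method countLines, and the 1 MiB figure are assumptions made for this example, not something from the blog. Whether a larger buffer actually helps depends on the disk and the JVM, so measure before relying on it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

class BufferedLineCounter {
    // Assumed buffer size for illustration: 1 Mi chars (the second constructor argument).
    static final int BUFFER_CHARS = 1 << 20;

    // Counts lines from the source one at a time, never holding more than one line.
    static long countLines(Reader source) throws IOException {
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader br = new BufferedReader(source, BUFFER_CHARS)) {
            long count = 0;
            while (br.readLine() != null) {
                count++;
            }
            return count;
        }
    }
}
```

For a file you could call it as countLines(new InputStreamReader(new FileInputStream("textfile.txt"))).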
Question

I need to read a large text file of around 5-6 GB line by line using Java.

How can I do this quickly?




A clear way to achieve this is with Scanner.

For example, if you have dataFile.txt in your current directory:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadByLine
{
    public ReadByLine() throws FileNotFoundException
    {
        // try-with-resources closes the Scanner even if reading fails
        try (Scanner lineReader = new Scanner(new File("dataFile.txt")))
        {
            // hasNextLine() is the check that pairs with nextLine()
            while (lineReader.hasNextLine())
            {
                String line = lineReader.nextLine();
                System.out.println(line);
            }
        }
    }

    public static void main(String[] args) throws FileNotFoundException
    {
        new ReadByLine();
    }
}

The program prints each line of dataFile.txt to the console.




I usually write the reading routine in a straightforward way:

void readResource(InputStream source) throws IOException {
    BufferedReader stream = null;
    try {
        stream = new BufferedReader(new InputStreamReader(source));
        while (true) {
            String line = stream.readLine();
            if(line == null) {
                break;
            }
            //process line
            System.out.println(line);
        }
    } finally {
        closeQuiet(stream);
    }
}

static void closeQuiet(Closeable closeable) {
    if (closeable != null) {
        try {
            closeable.close();
        } catch (IOException ignore) {
        }
    }
}



In Java 7:

String folderPath = "C:/folderOfMyFile";
Path path = Paths.get(folderPath, "myFileName.csv"); //or any text file eg.: txt, bat, etc
Charset charset = Charset.forName("UTF-8");

try (BufferedReader reader = Files.newBufferedReader(path, charset)) {
  String line;
  while ((line = reader.readLine()) != null) {
    //separate all csv fields into string array
    String[] lineVariables = line.split(","); 
  }
} catch (IOException e) {
    System.err.println(e);
}



In Java 8, there is also an alternative to using Files.lines(). If your input source isn't a file but something more abstract like a Reader or an InputStream, you can stream the lines via BufferedReader's lines() method.

For example:

try( BufferedReader reader = new BufferedReader( ... ) ) {
  reader.lines().forEach( line -> processLine( line ) );
}

will call processLine() for each input line read by the BufferedReader.
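
A more complete sketch of the same idea, in case it helps. ReaderLines and longestLine are names made up for this example. It takes any Reader, streams the lines lazily, and unwraps the UncheckedIOException that lines() uses to report read failures.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class ReaderLines {
    // Returns the length of the longest line; lines are pulled lazily, one at a time.
    static int longestLine(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.lines()
                         .mapToInt(String::length)
                         .max()
                         .orElse(0); // no lines at all
        } catch (UncheckedIOException e) {
            // lines() wraps read errors; rethrow the original IOException
            throw e.getCause();
        }
    }
}
```

Because the source is just a Reader, the same method works for a socket, a zip entry, or a StringReader in a unit test.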




You can read the text with a Scanner and go through it line by line. You need the following imports:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public static void readText() throws FileNotFoundException {
    // try-with-resources closes the Scanner (and the underlying file) when done
    try (Scanner scan = new Scanner(new File("samplefilename.txt"))) {
        while (scan.hasNextLine()) {
            String line = scan.nextLine();
            // Here you can manipulate the string the way you want
        }
    }
}

Scanner basically scans through the text; the while loop traverses all of it.

The .hasNextLine() method returns a boolean: true if there is another line in the input. The .nextLine() method returns the next line as a String, which you can then use however you want. Try System.out.println(line) to print the text.

Side note: .txt is the file extension for plain text files.
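
As a small self-contained illustration of how hasNextLine() and nextLine() pair up, here is a sketch that skips blank lines. ScannerLines and nonBlankLines are hypothetical names for this example only. It accepts any Readable so it can be tried without a file. It collects results in a list purely for demonstration; with a multi-gigabyte file you would process each line inside the loop instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerLines {
    // Returns the lines of the source that contain something other than whitespace.
    static List<String> nonBlankLines(Readable source) {
        List<String> result = new ArrayList<>();
        // Closing the Scanner also closes the source if it is Closeable
        try (Scanner scan = new Scanner(source)) {
            while (scan.hasNextLine()) {          // true while another line remains
                String line = scan.nextLine();    // the line without its terminator
                if (!line.trim().isEmpty()) {
                    result.add(line);
                }
            }
        }
        return result;
    }
}
```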




Reading a file with Java 8:

package com.java.java8;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

/**
 * The Class ReadLargeFile.
 *
 * @author Ankit Sood Apr 20, 2017
 */
public class ReadLargeFile {

    /**
     * The main method.
     *
     * @param args
     *            the arguments
     */
    public static void main(String[] args) {
        // try-with-resources closes the stream and the underlying file handle
        try (Stream<String> stream = Files.lines(Paths.get("C:\\Users\\System\\Desktop\\demoData.txt"))) {
            stream.forEach(System.out::println);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}



You can use streams to do it more concisely (here stringBuffer is an existing StringBuffer or StringBuilder):

Files.lines(Paths.get("input.txt")).forEach(s -> stringBuffer.append(s));
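
For a file in the 5-6 GB range you probably don't want to accumulate every line, so here is a sketch of the streaming approach that only keeps a running count. LineFilter, countMatching and the needle parameter are names invented for this example, and UTF-8 is an assumed encoding. The stream is opened in try-with-resources because Files.lines keeps the file open until the stream is closed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class LineFilter {
    // Counts the lines containing 'needle'; lines are read lazily, not all at once.
    static long countMatching(Path file, String needle) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> line.contains(needle)).count();
        }
    }
}
```

Usage would be something like countMatching(Paths.get("input.txt"), "ERROR").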



Here is a sample with full error handling and supporting charset specification for pre-Java 7. With Java 7 you can use try-with-resources syntax, which makes the code cleaner.

If you just want the default charset you can skip the InputStream and use FileReader.

InputStream ins = null; // raw byte-stream
Reader r = null; // cooked reader
BufferedReader br = null; // buffered for readLine()
try {
    String s;
    ins = new FileInputStream("textfile.txt");
    r = new InputStreamReader(ins, "UTF-8"); // leave charset out for default
    br = new BufferedReader(r);
    while ((s = br.readLine()) != null) {
        System.out.println(s);
    }
}
catch (Exception e)
{
    System.err.println(e.getMessage()); // handle exception
}
finally {
    if (br != null) { try { br.close(); } catch(Throwable t) { /* ensure close happens */ } }
    if (r != null) { try { r.close(); } catch(Throwable t) { /* ensure close happens */ } }
    if (ins != null) { try { ins.close(); } catch(Throwable t) { /* ensure close happens */ } }
}
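
For comparison, a rough sketch of the same pattern using the Java 7 try-with-resources syntax mentioned above. Utf8LineReader and totalChars are hypothetical names, and the character count stands in for whatever per-line processing you need. Closing the BufferedReader also closes the wrapped reader and stream, so a single resource declaration is enough here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class Utf8LineReader {
    // Sums the lengths of all lines (line terminators excluded), decoding as UTF-8.
    static long totalChars(InputStream in) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            long total = 0;
            String s;
            while ((s = br.readLine()) != null) {
                total += s.length();
            }
            return total;
        } // br.close() runs here, which closes the InputStreamReader and the stream
    }
}
```

You would call it as totalChars(new FileInputStream("textfile.txt")) and let the caller decide how to handle the IOException.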

Here is the Groovy version, with full error handling:

File f = new File("textfile.txt");
f.withReader("UTF-8") { br ->
    br.eachLine { line ->
        println line;
    }
}



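You can also use Apache Commons IO. Note that FileUtils.readLines loads the whole file into memory as a list, so it only suits files that fit in the heap: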

File file = new File("/home/user/file.txt");
try {
    List<String> lines = FileUtils.readLines(file);
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}




