Java Streams Explained Through File Filtering Tasks
Java Streams have revolutionized the way we process collections and IO operations by introducing a functional and declarative style. In this post, we’ll dive into practical file filtering tasks using the Java Stream API. We’ll compare traditional imperative approaches with modern, fluent-style solutions powered by streams. If you’ve ever needed to filter files by extension, size, or modification date, then this post is for you.
1. Reading a Directory Imperatively vs Using Streams
First, let’s take a look at how we list files from a directory using both imperative and functional approaches.
// Imperative style
File folder = new File("/path/to/directory");
File[] files = folder.listFiles();
if (files != null) {
    for (File file : files) {
        System.out.println(file.getName());
    }
}
Now, let’s do the same using Java Streams:
// Functional style with Streams
try (Stream<Path> paths = Files.list(Paths.get("/path/to/directory"))) {
    paths.forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}
Streams not only offer concise syntax; they also bring lazy evaluation, easy composition of filters and transformations, and opportunities for parallelization.
2. Filtering Files by Extension
Let’s say you want to list only .txt files in a directory. Here’s the imperative version:
// Imperative filtering
File dir = new File("/path/to/directory");
File[] entries = dir.listFiles();
if (entries != null) {  // listFiles() returns null if dir is not a directory
    for (File file : entries) {
        if (file.isFile() && file.getName().endsWith(".txt")) {
            System.out.println(file.getName());
        }
    }
}
And now, the cleaner Stream-based solution:
// Stream-based filtering
try (Stream<Path> stream = Files.list(Paths.get("/path/to/directory"))) {
    stream
        .filter(Files::isRegularFile)
        .filter(path -> path.toString().endsWith(".txt"))
        .forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}
Thanks to fluent chaining, the filtering criteria read top to bottom and are easy to extend with further conditions. For recursive filtering, Files.walk accepts a maxDepth argument that bounds how deep the traversal goes.
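As a sketch of the depth-limit idea (the directory path is a placeholder, and the listFiles helper is illustrative, not a standard API):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DepthLimitedListing {
    // List regular files with the given extension, descending at most
    // maxDepth levels below the start directory.
    static List<Path> listFiles(Path start, int maxDepth, String ext) throws IOException {
        try (Stream<Path> stream = Files.walk(start, maxDepth)) {
            return stream
                .filter(Files::isRegularFile)
                .filter(path -> path.toString().endsWith(ext))
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        listFiles(Paths.get("/path/to/directory"), 2, ".txt")
            .forEach(System.out::println);
    }
}
```

Note that maxDepth counts levels from the start directory: a depth of 1 visits only the directory's direct children, much like Files.list.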
3. Filtering Files by Size
Here’s a case where we want to list files larger than 1MB:
// Imperative size filter
long oneMB = 1024 * 1024;
File dir = new File("/path/to/directory");
File[] entries = dir.listFiles();
if (entries != null) {
    for (File file : entries) {
        if (file.isFile() && file.length() > oneMB) {
            System.out.println(file.getName());
        }
    }
}
Using streams and Files.size:
try (Stream<Path> stream = Files.list(Paths.get("/path/to/directory"))) {
    stream
        .filter(Files::isRegularFile)
        .filter(path -> {
            try {
                return Files.size(path) > 1024 * 1024;
            } catch (IOException e) {
                return false;
            }
        })
        .forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}
Be aware that Files.size() throws a checked IOException, and functional interfaces like Predicate cannot declare checked exceptions, so the exception must be handled inside the lambda.
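If the try-catch inside the lambda feels noisy, one common refactoring (a sketch of the pattern, not the only option; the sizeOf helper name is illustrative) is to extract a small method that converts the checked IOException into an UncheckedIOException:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SizeHelper {
    // Wraps Files.size so it can be called directly from a stream pipeline;
    // any IOException is rethrown as an unchecked exception.
    static long sizeOf(Path path) {
        try {
            return Files.size(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this helper the filter becomes a one-liner, e.g. .filter(path -> sizeOf(path) > 1024 * 1024). The trade-off: the pipeline now fails fast on the first unreadable file instead of silently skipping it, which may or may not be what you want.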
4. Recursive File Search with Streams
Files.walk() allows you to recursively stream through directory trees. Suppose you want to find all .log files located deeper in a directory structure:
try (Stream<Path> stream = Files.walk(Paths.get("/path/to/directory"))) {
    stream
        .filter(Files::isRegularFile)
        .filter(path -> path.toString().endsWith(".log"))
        .forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}
Because Files.walk traverses the entire tree depth-first, the same filter chain applies to files at any nesting level; pass a maxDepth argument if you need to bound the search.
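The introduction mentioned filtering by modification date, and the same pattern applies there too. As a sketch (the 7-day cutoff is an arbitrary example, and modifiedAfter is an illustrative helper), Files.getLastModifiedTime slots into a filter just as Files.size did:

```java
import java.io.IOException;
import java.nio.file.*;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.stream.Stream;

public class RecentFiles {
    // True if the file was modified after the given cutoff;
    // unreadable files are simply skipped.
    static boolean modifiedAfter(Path path, Instant cutoff) {
        try {
            return Files.getLastModifiedTime(path).toInstant().isAfter(cutoff);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Keep only files modified within the last 7 days.
        Instant cutoff = Instant.now().minus(7, ChronoUnit.DAYS);
        try (Stream<Path> stream = Files.walk(Paths.get("/path/to/directory"))) {
            stream
                .filter(Files::isRegularFile)
                .filter(path -> modifiedAfter(path, cutoff))
                .forEach(System.out::println);
        }
    }
}
```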
5. Performing Batch Operations (e.g., Delete, Copy)
With streams, batch operations like deleting temp files or copying data files become highly expressive:
// Delete .tmp files
try (Stream<Path> stream = Files.walk(Paths.get("/path/to/directory"))) {
    stream
        .filter(Files::isRegularFile)
        .filter(path -> path.toString().endsWith(".tmp"))
        .forEach(path -> {
            try {
                Files.delete(path);
                System.out.println("Deleted: " + path);
            } catch (IOException e) {
                System.err.println("Failed to delete: " + path);
            }
        });
} catch (IOException e) {
    e.printStackTrace();
}
Or copy all .csv files to an archive directory:
Path archive = Paths.get("/archive");
try (Stream<Path> stream = Files.walk(Paths.get("/path/to/source"))) {
    Files.createDirectories(archive); // Files.copy fails if the target directory is missing
    stream
        .filter(Files::isRegularFile)
        .filter(path -> path.toString().endsWith(".csv"))
        .forEach(path -> {
            try {
                Path target = archive.resolve(path.getFileName());
                Files.copy(path, target, StandardCopyOption.REPLACE_EXISTING);
                System.out.println("Copied to: " + target);
            } catch (IOException e) {
                System.err.println("Copy failed: " + path);
            }
        });
} catch (IOException e) {
    e.printStackTrace();
}
When performing batch IO, always wrap individual operations in try-catch to prevent one failure from stopping your whole stream.
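One way to keep that per-item try-catch out of every pipeline (a sketch of the pattern; IoConsumer and logging are illustrative names, not a library API) is a small adapter that turns an IO-throwing action into a plain Consumer:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.function.Consumer;

public class IoConsumers {
    @FunctionalInterface
    interface IoConsumer<T> {
        void accept(T t) throws IOException;
    }

    // Adapts an IO-throwing action into a Consumer that logs failures
    // instead of aborting the stream.
    static Consumer<Path> logging(IoConsumer<Path> action) {
        return path -> {
            try {
                action.accept(path);
            } catch (IOException e) {
                System.err.println("Failed: " + path + " (" + e.getMessage() + ")");
            }
        };
    }
}
```

The deletion pipeline above then shrinks to .forEach(logging(Files::delete)), and the same adapter works for copies or any other per-file operation.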
Conclusion
Using Java Streams for file operations leads to clean, declarative code that is easier to maintain. The combination of Files.walk(), filter(), and terminal operations like forEach() makes traversing and manipulating file systems a breeze. Parallel streams can also help when the per-file work is CPU-bound, though for IO-bound traversal they rarely pay off, since the disk remains the bottleneck.
Whether you’re filtering file extensions, sizes, or dates, the Stream API is a powerful tool to have in your Java toolkit. Start writing cleaner file-handling logic today using Streams!