CSV Export in Spring_Memory Challenges

  • 2024/8/15

Continuing from the previous post, this article presents the memory consumption results for each technique we discussed earlier. Before that, let’s review the basics of memory management. Knowing where data is stored in memory helps us prevent stack overflow errors and memory leaks.

1. Definition of Memory in Development Process

In the context of software development, “memory” typically refers to computer memory (RAM) and how it’s managed during the execution of a program. It involves the allocation, utilization, and release of memory resources by a program to store and process data. Proper memory management is crucial for efficient and stable software, as poor management can lead to issues such as memory leaks, excessive memory usage, and performance degradation.

Types of Memory in Development

Stack Memory:

The stack is a segment of memory that stores the temporary variables created by a function, in last-in-first-out (LIFO) order. Whenever a function is called, a frame is pushed onto the stack to hold its local variables and parameters (variables declared as static are not stored there), and the size of that frame is known to the compiler in advance. Once the function call completes, the frame is popped and the memory for those variables is released. This is handled automatically by the compiler and runtime, so the programmer does not need to allocate or free stack variables manually.

This type of allocation is also known as temporary memory allocation: as soon as a method finishes executing, all data associated with it is removed from the stack, so a value stored on the stack is accessible only while its method is still executing. Stack allocation and deallocation are fast because they only require adjusting a pointer. The stack is generally used for small, short-lived data such as local variables and function parameters. Because each thread’s stack has a limited size, very deep or unbounded recursion exhausts it, which in Java raises a StackOverflowError.
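To make the frame-per-call behaviour concrete, here is a minimal sketch in plain Java. The class and method names are hypothetical and chosen only for illustration. It recurses without a base case and reports how many frames fit before the JVM throws StackOverflowError; the exact number depends on the JVM and its thread stack size setting.

```java
// Hypothetical demo class; names are illustrative and not taken from the original post.
class StackDepthDemo {

    // Deepest call level reached so far (a static field, so it is not stored in a stack frame).
    static int depthReached = 0;

    // Each call pushes a new frame that holds its own copy of `depth`.
    static void recurse(int depth) {
        depthReached = depth;
        recurse(depth + 1); // no base case, so frames pile up until the stack is full
    }

    // Runs the unbounded recursion and returns how deep it got before overflowing.
    static int measureMaxDepth() {
        depthReached = 0;
        try {
            recurse(1);
        } catch (StackOverflowError e) {
            // By the time control reaches this handler, the recursive frames have been popped.
        }
        return depthReached;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureMaxDepth());
    }
}
```

Catching StackOverflowError is done here only to observe the limit; in application code the usual fix is to bound the recursion or replace it with a loop.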

Heap Memory:

Heap memory is a crucial part of an application’s memory management, residing in the system’s Random Access Memory (RAM). It is where the application stores objects created during its execution. Unlike stack memory, which holds method frames with their local primitive values and object references, heap memory is designed for dynamic memory allocation and object storage.

When the Java Virtual Machine (JVM) starts, it allocates heap space, which remains active for the lifetime of the application. This area is shared across all threads, making it accessible throughout the entire application. Objects in the heap are referenced by variables stored in the stack, linking the two memory areas.
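The link between stack references and heap objects can be seen in a small sketch. The class below is hypothetical. It passes an object and a primitive to two methods: the object is shared through a copied reference, while the primitive is copied by value. Conceptually, the reference and the primitive sit in the method’s frame and the object sits on the heap (the JIT compiler may optimize this in practice).

```java
// Hypothetical demo class; names are illustrative only.
class HeapReferenceDemo {

    // A small object type; instances are allocated on the heap.
    static final class Counter {
        int value;
    }

    // `c` is a copy of the caller's reference, so it points at the same heap object.
    static void increment(Counter c) {
        c.value++;
    }

    // `n` is a copy of a primitive value held in this method's own frame.
    static void increment(int n) {
        n++; // changes only the local copy
    }

    // Returns { counter.value, primitive } after both increment calls.
    static int[] run() {
        Counter counter = new Counter(); // reference in the frame, object on the heap
        int primitive = 0;               // the value itself is held in the frame
        increment(counter);
        increment(primitive);
        return new int[] { counter.value, primitive };
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("object field = " + result[0] + ", primitive = " + result[1]);
        // prints "object field = 1, primitive = 0"
    }
}
```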

The heap is organized into different segments, or “generations,” to optimize memory management:

Young Generation: This is where new objects are initially allocated. It undergoes frequent garbage collection, known as minor GC, to free up memory.

Old or Tenured Generation: Objects that survive long enough in the Young Generation are moved here. This segment is cleaned up less frequently, through a process called major GC, which tends to be more time-consuming.

Permanent Generation (Metaspace in Java 8+): Originally used to store metadata for classes and methods. Java 8 replaced the Permanent Generation with Metaspace, which is allocated from native memory outside the heap and offers better memory management for these elements.
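One way to see these regions on a running JVM is the standard java.lang.management API. The sketch below uses a hypothetical class name and lists every memory pool the JVM reports together with its type and current usage. Pool names depend on the JVM and the garbage collector in use, so the output may not match the generation names above word for word.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
class MemoryPoolLister {

    // Returns one line per memory pool reported by the running JVM.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // getUsage() returns null when the pool is no longer valid
            }
            long usedKb = usage.getUsed() / 1024;
            // getType() is either heap or non-heap memory
            lines.add(pool.getType() + " | " + pool.getName() + " | used " + usedKb + " KB");
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```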

** Permanent Generation (PermGen) / Metaspace:

The Permanent Generation, commonly known as PermGen, was a special memory region managed alongside the Java heap but sized separately from it, used primarily for storing class metadata, static content, and other related data. Unlike the rest of the heap, which holds dynamic objects created during the program’s execution, PermGen contained information essential to the Java program’s structure, such as class definitions, method information, and interned strings (until Java 7 moved interned strings to the main heap).

PermGen had a fixed maximum size, which could lead to issues if the space filled up, particularly in applications that dynamically loaded many classes or had a long runtime. When PermGen ran out of space, it could cause OutOfMemoryError exceptions, making memory management in large applications challenging.

To address these limitations, Java 8 replaced PermGen with Metaspace, which offers more flexible memory management. Unlike PermGen, Metaspace can grow dynamically, adjusting its size as needed. Garbage collection in Metaspace is automatically triggered when class metadata usage reaches a certain threshold, improving overall memory management and reducing the likelihood of memory-related errors.

** Off-Heap Memory:

Off-heap memory refers to a memory area outside of the Java heap, which is not managed by the Java garbage collector. Unlike on-heap memory, where Java objects reside and are automatically managed by the JVM, off-heap memory allows developers to allocate memory manually, giving them more control over memory management.

Off-heap memory is particularly useful in scenarios where interaction with native code, such as C or C++, is required. In such cases, data shared with native functions typically needs to reside in off-heap memory, where the garbage collector cannot move it. Since this memory is not governed by the garbage collector, it avoids the overhead associated with garbage collection, potentially leading to performance improvements.

However, off-heap memory requires explicit management. Developers must carefully control the allocation and deallocation of this memory to prevent leaks or inefficient use of resources. This approach provides flexibility and efficiency but also places more responsibility on developers to manage memory manually.
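As a small illustration, the standard library’s ByteBuffer.allocateDirect reserves memory outside the Java heap. The sketch below uses a hypothetical class name, writes a value into such a buffer and reads it back. Note that a direct buffer is only partly manual: its native memory is released after the small ByteBuffer object that wraps it becomes unreachable, so long-lived or oversized direct buffers can still exhaust native memory.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; the name is illustrative only.
class OffHeapDemo {

    // Writes a long into memory allocated outside the Java heap and reads it back.
    static long roundTrip(long value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(Long.BYTES); // 8 bytes of native memory
        buffer.putLong(0, value);  // absolute write at offset 0
        return buffer.getLong(0);  // absolute read at offset 0
    }

    // Confirms that allocateDirect returns a direct (off-heap) buffer.
    static boolean allocatesDirectBuffer() {
        return ByteBuffer.allocateDirect(16).isDirect();
    }

    public static void main(String[] args) {
        System.out.println("direct buffer: " + allocatesDirectBuffer());
        System.out.println("round trip: " + roundTrip(42L));
    }
}
```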

2. Memory consumption

Returning to the main topic, below are the execution times we obtained in the previous post (measured in seconds).

Technique            List (s)    Stream (s)  DTO (s)
Batch                8163.085    465.023     129.655
BufferedWriter       18527.395   54.302      102.978
Apache Commons CSV   45.824      60.289      106.527
InputStreamResource  37.491      63.738      99.591

This is the memory consumption comparison table.

Technique            List     Stream   DTO
Batch                12MB     300MB    577MB
BufferedWriter       1681MB   2259MB   1351MB
Apache Commons CSV   2391MB   2100MB   824MB
InputStreamResource  1520MB   54MB     623MB
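The post does not say how these figures were collected. As one possible approach, and purely as an assumption, used heap can be sampled before and after a task with the standard Runtime API. The class name and the workload below are hypothetical, and the result is only approximate because System.gc() is a hint and the JVM may collect garbage at any time.

```java
// Hypothetical measurement helper; not the method used for the table above.
class HeapUsageProbe {

    // Heap currently in use by the JVM, in bytes.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Runs the task and returns the approximate growth in used heap, in MB.
    static long measureMb(Runnable task) {
        System.gc(); // only a request; the JVM may ignore it
        long before = usedHeapBytes();
        task.run();
        long after = usedHeapBytes();
        return (after - before) / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Hypothetical workload: keep a 16 MB array reachable so it is not collected.
        byte[][] holder = new byte[1][];
        long mb = measureMb(() -> holder[0] = new byte[16 * 1024 * 1024]);
        System.out.println("Approximate heap growth: " + mb + " MB");
    }
}
```

For repeatable numbers, a profiler or the JVM’s own monitoring tools are usually a better fit than a single before-and-after sample.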

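Based on these results, we can say with more confidence that InputStreamResource with Stream is the most efficient option for CSV export. At 54MB it uses far less memory than every other combination except Batch with List (12MB), which was among the slowest, and its execution time (63.738 seconds) stays close to the fastest results.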
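The reason a stream-based export can stay small in memory is that each row is written as soon as it is produced instead of being collected first. The following is a minimal plain-Java sketch of that idea, not the Spring setup measured above: the class name, the row source, and the column names are all hypothetical.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical sketch; in a real export the Writer would wrap the HTTP response or a file.
class StreamingCsvSketch {

    // Stand-in for a lazily read data source: rows are generated one at a time.
    static Stream<String> rows(int count) {
        return IntStream.rangeClosed(1, count).mapToObj(i -> i + ",user" + i);
    }

    // Writes a header, then each row as it arrives; returns the number of data rows written.
    static int export(Stream<String> rows, Writer out) throws IOException {
        int[] written = {0};
        BufferedWriter writer = new BufferedWriter(out);
        writer.write("id,name");
        writer.newLine();
        try (Stream<String> source = rows) { // closing the stream releases its underlying resources
            source.forEach(row -> {
                try {
                    writer.write(row);
                    writer.newLine();
                    written[0]++;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        writer.flush();
        return written[0];
    }

    // Convenience wrapper for the demo; a StringWriter keeps the output in memory,
    // which is fine for a tiny sample but not for a large export.
    static String exportToString(int count) {
        StringWriter out = new StringWriter();
        try {
            export(rows(count), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(exportToString(3));
    }
}
```

Only one row is referenced at a time inside the loop, so memory use does not grow with the number of rows, provided the source itself is lazy.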

3. Summary

Streams in Java offer a powerful way to process data but come with disadvantages like reduced readability, potential performance overhead, and limited control over execution. It’s essential to consider these trade-offs and select the appropriate approach based on your application’s needs and task complexity. CSV export is a common function in data management systems, so understanding these nuances can be beneficial.
