CSV Export in Spring: Memory Challenges
Continuing from the previous topic, this post presents the memory consumption of each technique we discussed earlier. Before that, let's build a basic understanding of memory management. Knowing where data is stored in memory helps us prevent stack overflow errors and memory leaks.
1. Definition of Memory in the Development Process
In the context of software development, “memory” typically refers to computer memory (RAM) and how it’s managed during the execution of a program. It involves the allocation, utilization, and release of memory resources by a program to store and process data. Proper memory management is crucial for efficient and stable software, as poor management can lead to issues such as memory leaks, excessive memory usage, and performance degradation.
Types of Memory in Development
Stack Memory:
The stack is a segment of memory that stores the temporary variables created by a function, in last-in-first-out (LIFO) order. Whenever a function is called, a frame holding its parameters and local variables (but not variables declared as static) is pushed onto the stack. The size of that frame is known to the compiler, so the allocation itself costs almost nothing at runtime. Once the function call completes, the frame is popped and the memory for those variables is released. This process is handled by code that the compiler and runtime generate, so the programmer does not need to allocate or free stack variables manually.
This type of allocation is also known as temporary memory allocation: as soon as the method finishes executing, all data associated with it is discarded from the stack. Any value stored in stack memory is therefore accessible only while the method is still executing. Stack memory is fast to allocate and deallocate because it only requires adjusting the stack pointer. It is generally used for small, short-lived data such as local variables and function parameters. The stack is also limited in size; in Java, each thread has its own stack, and recursion that goes too deep ends in a StackOverflowError.
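To make the frame-per-call behavior concrete, here is a minimal sketch in plain Java. The class and method names (`StackDepthDemo`, `square`, `framesUntilOverflow`) are made up for illustration. It shows a local variable that lives only for the duration of one call, and a recursion without a base case that keeps pushing frames until the JVM throws `StackOverflowError`. The number of frames reached depends on the JVM and its thread stack size, so no particular value should be expected.

```java
class StackDepthDemo {
    private static int depth = 0;

    // A local variable lives in the frame of the call that declared it.
    static int square(int n) {
        int result = n * n; // stored in this call's frame, discarded on return
        return result;
    }

    // Each call pushes a new frame; with no base case the stack eventually fills up.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many calls were made before the JVM threw StackOverflowError.
    static int framesUntilOverflow() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here the frames have been popped again.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("square(7) = " + square(7));
        System.out.println("calls before overflow: " + framesUntilOverflow());
    }
}
```

Catching `StackOverflowError` is done here only to keep the demonstration running; production code should fix the recursion instead.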
Heap Memory:
Heap memory is a crucial part of an application's memory management, residing in the system's Random Access Memory (RAM). It is where the application stores objects created during its execution. Unlike stack memory, which holds method frames with their local primitives and object references, heap memory is designed for dynamic memory allocation and object storage.
When the Java Virtual Machine (JVM) starts, it allocates heap space, which remains active for the lifetime of the application. This area is shared across all threads, making it accessible throughout the entire application. Objects in the heap are reached through references held in stack variables (or in fields of other heap objects), which links the two memory areas.
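The link between stack references and heap objects can be sketched as follows. The class `HeapReferenceDemo` and its methods are illustrative names, not part of any library. Two local variables hold references to the same array on the heap, so a write through one is visible through the other. The heap figures come from `Runtime`, and the reported difference is only an approximation because the garbage collector may run at any time.

```java
class HeapReferenceDemo {

    // Approximate number of heap bytes currently in use.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Both local variables are references on the stack; the array itself is on the heap.
    static boolean aliasesShareObject() {
        int[] data = new int[4];
        int[] alias = data;   // copies the reference, not the array
        alias[0] = 42;
        return data[0] == 42; // the write is visible through the other reference
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        int[] big = new int[5_000_000]; // roughly 20 MB of heap
        long after = usedHeapBytes();
        System.out.println("array length: " + big.length);
        System.out.println("approximate heap growth in bytes: " + (after - before));
        System.out.println("aliases share one object: " + aliasesShareObject());
    }
}
```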
The heap is organized into different segments, or “generations,” to optimize memory management:
Young Generation: This is where new objects are initially allocated. It undergoes frequent garbage collection, known as minor GC, to free up memory.
Old or Tenured Generation: Objects that survive long enough in the Young Generation are moved here. This segment is cleaned up less frequently, through a process called major GC, which tends to be more time-consuming.
Permanent Generation (Metaspace in Java 8+): Originally used to store metadata for classes and methods. Strictly speaking, this is not a generation that objects are promoted into, and its Java 8 replacement, Metaspace, lives in native memory outside the heap.
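One way to observe minor and major collections on a running JVM is through the standard `java.lang.management` API, as in the sketch below. The class name `GcActivityDemo` is made up for illustration. The collector names that get printed depend on which garbage collector the JVM is using, so the output differs between machines and JVM versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcActivityDemo {

    // One line per collector the running JVM exposes, with its run count and total time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", totalMillis=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate many short-lived arrays to put pressure on the young generation.
        long allocated = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] garbage = new byte[1024];
            allocated += garbage.length;
        }
        System.out.println("allocated bytes: " + allocated);
        describeCollectors().forEach(System.out::println);
    }
}
```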
** Permanent Generation (PermGen) / Metaspace:
The Permanent Generation, commonly known as PermGen, was a special memory region that the HotSpot JVM managed alongside the Java heap but sized separately (through -XX:MaxPermSize rather than -Xmx). It was used primarily for storing class metadata, static content, and other related data. Unlike the rest of the heap, which holds dynamic objects created during the program's execution, PermGen contained information essential to the Java program's structure, such as class definitions, method information, and, before Java 7, interned strings.
PermGen had a fixed maximum size, which could lead to issues if the space filled up, particularly in applications that dynamically loaded many classes or ran for a long time. When PermGen ran out of space, the JVM threw an OutOfMemoryError, making memory management in large applications challenging.
To address these limitations, Java 8 replaced PermGen with Metaspace, which offers more flexible memory management. Metaspace is allocated from native memory rather than from the Java heap, and unlike PermGen it can grow dynamically, adjusting its size as needed; by default it is bounded only by the available native memory unless -XX:MaxMetaspaceSize is set. Garbage collection of Metaspace is triggered automatically when class metadata usage reaches a certain threshold, which improves overall memory management and reduces the likelihood of memory-related errors.
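The JVM reports its memory pools, together with whether each one counts as heap or non-heap, through `MemoryPoolMXBean`. The sketch below lists them; `MemoryPoolDemo` and `typeOfPool` are illustrative names. It assumes a HotSpot-based JVM on Java 8 or later, where a pool named "Metaspace" is normally present; other JVMs may use different pool names, which is why the lookup can return null.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class MemoryPoolDemo {

    // Returns the type of the pool with the given name, or null if this JVM has no such pool.
    static MemoryType typeOfPool(String name) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().equals(name)) {
                return pool.getType();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means no maximum is defined
            System.out.println(pool.getType() + " " + pool.getName()
                    + " used=" + usage.getUsed() / 1024 + "KB"
                    + " max=" + (max < 0 ? "undefined" : (max / 1024) + "KB"));
        }
        System.out.println("Metaspace pool type: " + typeOfPool("Metaspace"));
    }
}
```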
** Off-Heap Memory:
Off-heap memory refers to a memory area outside of the Java heap, which is not managed by the Java garbage collector. Unlike on-heap memory, where Java objects reside and are automatically managed by the JVM, off-heap memory allows developers to allocate memory manually, giving them more control over memory management.
Off-heap memory is particularly useful in scenarios where interaction with native code, such as C or C++, is required. In such cases, arguments passed to native functions need to reside in off-heap memory. Since this memory is not governed by the garbage collector, it avoids the overhead associated with garbage collection, potentially leading to performance improvements.
However, off-heap memory requires explicit management. Developers must carefully control the allocation and deallocation of this memory to prevent leaks or inefficient use of resources. This approach provides flexibility and efficiency but also places more responsibility on developers to manage memory manually.
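As a small illustration of memory outside the garbage-collected heap, the sketch below uses a direct `ByteBuffer` from the standard library. The class name `OffHeapDemo` is made up for illustration. Note the caveat: a direct buffer's contents may reside outside the normal heap, but its release is still tied to the buffer object being garbage collected, so this is not fully manual management. Explicit allocation and deallocation would need other APIs that are not shown here.

```java
import java.nio.ByteBuffer;

class OffHeapDemo {

    // Writes a long into a direct buffer and reads it back.
    static long roundTrip(long value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(Long.BYTES);
        buffer.putLong(0, value);  // absolute write at byte offset 0
        return buffer.getLong(0);  // absolute read from the same offset
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);
        ByteBuffer onHeap = ByteBuffer.allocate(1024);
        System.out.println("direct buffer is direct: " + direct.isDirect());
        System.out.println("heap buffer is direct: " + onHeap.isDirect());
        System.out.println("round trip: " + roundTrip(123456789L));
    }
}
```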
2. Memory Consumption
Returning to the main topic, here are the execution times we obtained previously, measured in seconds:
Technique | List | Stream | DTO |
Batch | 8163.085 | 465.023 | 129.655 |
BufferedWriter | 18527.395 | 54.302 | 102.978 |
Apache Commons CSV | 45.824 | 60.289 | 106.527 |
InputStreamResource | 37.491 | 63.738 | 99.591 |
The table below compares memory consumption for the same combinations.
Technique | List | Stream | DTO |
Batch | 12MB | 300MB | 577MB |
BufferedWriter | 1681MB | 2259MB | 1351MB |
Apache Commons CSV | 2391MB | 2100MB | 824MB |
InputStreamResource | 1520MB | 54MB | 623MB |
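Based on these results, we can say with more confidence that InputStreamResource with Stream is the most efficient option for handling CSV export: it used only 54MB, the second-lowest figure in the table, while finishing in 63.738 seconds. Batch with List used even less memory (12MB) but took 8163.085 seconds, and the fastest combinations with List each consumed more than 1500MB.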
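The general idea behind a stream-based export is that rows are written one at a time instead of being collected into a list first. The sketch below shows this in plain Java. It is not the benchmarked code and does not use Spring's InputStreamResource; the class `StreamingCsvSketch`, its methods, and the sample rows are all hypothetical. The quoting rule is a simplified one: a field is quoted when it contains a comma, a double quote, or a line break.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamingCsvSketch {

    // Quotes a field when it contains a comma, a double quote, or a line break.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Writes each row as soon as the stream produces it; returns the number of rows written.
    static long writeCsv(Stream<String[]> rows, Writer out) {
        long[] count = {0};
        try (Stream<String[]> source = rows) {
            source.forEachOrdered(row -> {
                StringBuilder line = new StringBuilder();
                for (int i = 0; i < row.length; i++) {
                    if (i > 0) line.append(',');
                    line.append(escape(row[i]));
                }
                line.append('\n');
                try {
                    out.write(line.toString());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                count[0]++;
            });
        }
        return count[0];
    }

    // Convenience wrapper that collects the output in a String (for small inputs only).
    static String toCsv(Stream<String[]> rows) {
        StringWriter out = new StringWriter();
        writeCsv(rows, out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Rows are generated lazily; only one row exists in memory at a time.
        Stream<String[]> rows = IntStream.rangeClosed(1, 3)
                .mapToObj(i -> new String[] {String.valueOf(i), "user, " + i});
        System.out.print(toCsv(rows));
    }
}
```

In a real export the `Writer` would wrap the HTTP response or a file rather than a `StringWriter`, so that the output itself is never held in memory as a whole.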
3. Summary
Streams in Java offer a powerful way to process data but come with disadvantages such as reduced readability, potential performance overhead, and limited control over execution. It's essential to weigh these trade-offs and select the approach that fits your application's needs and the complexity of the task. CSV export is a common feature in data management systems, so knowing how each approach trades execution time against memory is worth the effort.